Finding statistically significant communities in networks

Community structure is one of the main structural features of networks, revealing both their
internal organization and the similarity of their elementary units. Despite the large variety of methods
proposed to detect communities in graphs, there is a strong need for multi-purpose techniques, able
to handle different types of datasets and the subtleties of community structure. In this paper we
present OSLOM (Order Statistics Local Optimization Method), the first method capable of detecting
clusters in networks accounting for edge directions, edge weights, overlapping communities,
hierarchies and community dynamics. It is based on the local optimization of a fitness function expressing
the statistical significance of clusters with respect to random fluctuations, which is estimated with
tools of Extreme and Order Statistics. OSLOM can be used alone or as a refinement procedure of
partitions/covers delivered by other techniques. We have also implemented sequential algorithms
combining OSLOM with other fast techniques, so that the community structure of very large
networks can be uncovered. The performance of our method is comparable to that of the best existing
algorithms on artificial benchmark graphs. Several applications on real networks are shown as well. OSLOM
is implemented in freely available software (http://www.oslom.org), and we believe it will be a
valuable tool in the analysis of networks.



PACS numbers: 89.75.Hc 

I. INTRODUCTION 

The analysis and modeling of networked datasets are 
probably the hottest research topics within the modern 
science of complex systems [1-7]. The main reason is
that, despite its simplicity, the network representation 
can disclose some relevant features of the system at large, 
involving its structure, its function, as well as the inter- 
play between structure and function. The elementary 
units of the system are reduced to simple points, called 
vertices (or nodes), while their pairwise relationships/interactions
are pictured as edges (or links). It is fairly
easy to spot the two main ingredients of a graph in many 
instances. Therefore networks can be found everywhere: 
in biology (e.g., proteins and their interactions), ecology
(e.g., species and their trophic interactions), society
(e.g., people and their acquaintanceships). Other noteworthy
examples include the Internet (routers/autonomous
systems and their physical and/or wireless connections),
the World Wide Web (URLs and their hyperlinks), etc.

The structure of most networks, beneath the intrinsic 
disorder due to the stochastic character of their genera- 
tion mechanisms, reveals a high degree of organization. 
In particular, vertices with similar properties or function 
have a higher chance to be linked to each other than ran- 
dom pairs of vertices and tend to form highly cohesive 
subgraphs, which are called communities (also modules 
or clusters). Examples of communities are groups of mutual
acquaintances in social networks [8-10], subsets of
Web pages on the same subject [11], compartments in
food webs [12, 13], functional modules in protein interaction
networks [14], biochemical pathways in metabolic
networks [15, 16], etc.



Detecting communities in graphs may help to identify 
functional subunits of the system and to uncover simi- 
larities among vertices that are not apparent in the ab- 
sence of detailed (non-topological) information. Vertices 
belonging to the same community may be classified ac- 
cording to their structural position within the cluster, 
which may be correlated to their role. Vertices in the 
core of the cluster may have a function of control and 
stability within the module, whereas boundary vertices 
are likely to be mediators between different parts of the 
graph. The community structure of a network can also be 
a powerful visual representation of the system: instead 
of visualizing all the vertices and edges of the network 
(which is impossible on large systems), one could display 
its communities and their mutual connections, obtain- 
ing a far more compact and understandable description 
of the graph as a whole. It is thus not surprising that 
community detection in graphs has been so extensively 
investigated over the last few years [17]. A huge vari- 
ety of different methods have been designed by a truly 
interdisciplinary community of scholars, including physi- 
cists, computer scientists, mathematicians, biologists, en- 
gineers and social scientists. 

However, most algorithms currently available cannot 
handle important network features. Many methods are 
designed to find clusters in undirected graphs, and cannot
be extended to directed graphs easily, or at all.
Yet there are many datasets for which edge directedness
is an essential feature. Citation networks, food
webs and the Web graph are but a few examples. Sim- 
ilar problems arise when edges carry weights, indicating 
the strength of the interaction/affinity between vertices, 
although extensions are generally easier in this case. 

Likewise, the great majority of algorithms are not capable
of dealing with the peculiar features of community
structure. For example, each vertex is typically assigned 
to a single cluster, while in several instances, like in so- 
cial networks, vertices are typically shared between two 
or more clusters. In such cases communities are overlapping
(and partitions become covers) and very few methods
account for this possibility [18-25], which considerably
increases the complexity of the problem. Furthermore,
community structure is very often hierarchical, i.e.
it consists of communities which include (or are included 
by) other communities. Hierarchies are common in hu- 
man societies and are crucial for an efficient management 
of large organizations. Simon pointed out that hierarchy 
gives robustness and stability to complex systems, yielding
an evolutionary advantage in the long run [26]. However,
most community finding methods typically look for
the "best" partition of a network, disregarding the possi- 
ble existence of hierarchical structure. Instead, a method 
should be able to recognize if there is hierarchical structure
and, if so, identify the corresponding levels [27-29].

It is also very important for a method to distinguish 
communities from pseudo-communities. The existence of 
clusters indicates a preference by some groups of vertices
to link to each other. But, if the linking probability is 
the same for all pairs of vertices, like in random graphs, 
no communities are expected. In this case, concentrations
of edges within groups of vertices are simply the
result of random fluctuations; they do not represent potentially
non-trivial structures. Many algorithms are not
able to see this difference and find clusters in random 
graphs as well, although they are not meaningful. Schol- 
ars have just begun to assess the issue of significance of 
clusters [30, 31].

Finally, given the recent availability of time-stamped 
networked datasets, it is now possible to carry out quan- 
titative studies on the dynamics of community structure, 
about which very little is known [32-37]. A simple way
to treat dynamic datasets is to analyze snapshots of the 
system at different times separately, and then map com- 
munities of different snapshots onto each other, such that 
one can follow the dynamics of each cluster in time. However,
focusing on individual snapshots means disregard-
ing the information on the system at previous times. Ide- 
ally a partition/cover of the system at time t should be 
faithful both to its structure at time t and to its his- 
tory [3l[37]. 

In this paper we propose the first method able to meet 
all requirements listed above, the Order Statistics Lo- 
cal Optimization Method (OSLOM). It is a method that 
optimizes locally the statistical significance of clusters, 
defined with respect to a global null model. The concept 
of statistical significance is inspired by recent work of 
some of the authors [31, 38]. The paper is structured as
follows. After introducing the method, we test its per- 
formance on artificial benchmark graphs, comparing it 
with the performances of the best algorithms currently 
available. Next, we pass to the analysis of real networks, 
followed by a final discussion on the work. Some of the
tests on artificial and real networks are reported in the
Appendix.



II. METHODS 
A. Statistical significance of clusters 

In this section we explain how to estimate the statis- 
tical significance of a given cluster. OSLOM will use 
the significance as a fitness measure in order to evaluate 
the clusters. Following our previous work [31], we define
it as the probability of finding the cluster in a random
null model, i.e. in a class of graphs without community
structure. We choose the configuration model [39]
as our null model. This is a model designed to build 
random networks with a given distribution of the num- 
ber of neighbors of a vertex (degree). The networks are 
generated by randomly joining vertices under the constraint
that each vertex has a fixed number of neighbors,
taken from the pre-assigned degree distribution. This is 
basically the same null model adopted by Newman and 
Girvan to define modularity [40].
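
As an illustration of the null model, the following is a minimal sketch of one common way to realize the configuration model, namely uniform random matching of edge "stubs". The function names `configuration_model` and `degree_sequence` are ours and are not taken from the OSLOM software; self-loops and multiple edges are allowed here, which is an assumption of this sketch.

```python
import random


def configuration_model(degrees, seed=None):
    """Pair up edge stubs uniformly at random.

    degrees[v] is the prescribed degree of vertex v. Self-loops and
    multi-edges may appear (an assumption of this sketch).
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the degree sum must be even")
    rng = random.Random(seed)
    # One stub per unit of degree: vertex v appears degrees[v] times.
    stubs = [v for v, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    # Consecutive stubs of the shuffled list are joined into an edge.
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]


def degree_sequence(n, edges):
    """Recompute vertex degrees from an edge list (a self-loop counts twice)."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg
```

By construction every realization has exactly the prescribed degrees, which is the only property of the original graph that the null model retains.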




Figure 1: A schematic representation of a subgraph $C$, whose
significance is to be assessed. The subgraph $C$ is embedded
within a random graph generated by the configuration model.
The degrees of all vertices of the network are fixed; in the
figure we have highlighted the degrees of $C$ ($m_C$), of the vertex
$i$ at the center of the analysis ($k_i$) and of the rest of the graph
$\mathcal{G} \setminus [C \cup \{i\}]$ ($M$). These quantities are expressed as sums of
contributions which are internal to their own set of vertices
(as $M^*$) or related to subgraph $C$ (in or out). This notation
is used in the distribution of Eq. 1.

We start from a graph $\mathcal{G}$ with $N$ vertices and $E$ edges.
The framework for the analysis is sketched in Fig. 1.
We are given a subgraph $C$, whose significance is to be
assessed, a vertex $i \notin C$ and the degree of the vertices
of the rest of the graph $\mathcal{G} \setminus [C \cup \{i\}]$. The degree of
subgraph $C$ is $m_C$, $k_i$ is the degree of $i$, and the rest of the
vertices have a total degree $M$. We can separate the
above quantities into the contributions internal or external
to $C$ ($m_C^{in}$, $m_C^{out}$, $k_i^{in}$ and $k_i^{out}$); the internal degree of
$\mathcal{G} \setminus [C \cup \{i\}]$ is $M^*$ (Fig. 1).

Let us suppose that $C$ is a subgraph of graphs generated
by the configuration model, where each vertex maintains
the degree it has in the graph $\mathcal{G}$ under study. We assume that
the internal degree $m_C^{in}$ of the subgraph is fixed. If all
the other edges of the network are randomly drawn, the
probability that $i$ has $k_i^{in}$ neighbors in $C$ can be written
as [38]

$$p(k_i^{in} | i, C, \mathcal{G}) = A \, \frac{2^{-k_i^{in}}}{k_i^{out}! \; k_i^{in}! \; (m_C^{out} - k_i^{in})! \; (M^*/2)!}. \qquad (1)$$

This equation enumerates the possible configurations of
the network with $k_i^{in}$ connections between $i$ and $C$. The
factorials of the formula express the multiplicity of configurations
with fixed values of $k_i^{in}$, $k_i^{out}$, $(m_C^{out} - k_i^{in})$ and
$M^*/2$, whereas the power of 2 in the numerator accounts
for the multiplicity coming from the permutation of the
endpoints of edges lying between $i$ and $C$. Several of the
terms in the expression can actually be written as a function
of constants and $k_i^{in}$, such as $k_i^{out} = k_i - k_i^{in}$ and
$M^* = 2E - m_C - m_C^{out} - 2k_i + 2k_i^{in}$. The normalization
factor $A$ includes terms not depending on $k_i^{in}$ and
ensures that

$$\sum_{k_i^{in} \,:\, M^* > 0} p(k_i^{in} | i, C, \mathcal{G}) = 1. \qquad (2)$$
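
To make the roles of the factorials and of the normalization concrete, here is a minimal numerical sketch of Eqs. 1 and 2. It is not the implementation described in the Appendix: the function name `kin_distribution` and its argument names are ours, the factorials are evaluated through `math.lgamma` to avoid overflow, and the relations $k_i^{out} = k_i - k_i^{in}$ and $M^* = 2E - m_C - m_C^{out} - 2k_i + 2k_i^{in}$ are used to express every term through $k_i^{in}$.

```python
from math import exp, lgamma, log


def kin_distribution(k_i, m_c, m_c_out, two_e):
    """Return {k_in: p(k_in | i, C, G)} following Eq. 1, normalised as in Eq. 2.

    k_i     : degree of vertex i
    m_c     : total degree of the subgraph C
    m_c_out : external degree of C
    two_e   : twice the number of edges of the graph
    """
    log_w = {}
    for k_in in range(min(k_i, m_c_out) + 1):
        k_out = k_i - k_in
        m_star = two_e - m_c - m_c_out - 2 * k_i + 2 * k_in
        if m_star <= 0:  # the sum of Eq. 2 runs over configurations with M* > 0
            continue
        # log of 2^{-k_in} / (k_out! k_in! (m_c_out - k_in)! (M*/2)!)
        log_w[k_in] = (-k_in * log(2.0)
                       - lgamma(k_out + 1) - lgamma(k_in + 1)
                       - lgamma(m_c_out - k_in + 1) - lgamma(m_star / 2 + 1))
    if not log_w:
        raise ValueError("no admissible configuration")
    shift = max(log_w.values())  # subtract the maximum to avoid underflow
    weights = {k: exp(v - shift) for k, v in log_w.items()}
    total = sum(weights.values())  # plays the role of 1/A
    return {k: w / total for k, w in weights.items()}
```

For instance, `kin_distribution(k_i=5, m_c=40, m_c_out=12, two_e=400)` returns six probabilities, for $k_i^{in} = 0, \dots, 5$, which add up to one.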

Further details on the numerical implementation of the
formula in Eq. 1, as well as on the different approximations
taken and their limits, are included in Appendix A.
The probability of Eq. 1 provides a tool to rank the
vertices external to $C$ according to the likelihood of their
topological relation with the group. If vertex $i$ shares
many more edges with the vertices of subgraph $C$ than
expected in the null model, we could consider the inclusion
of $i$ in $C$, since the relationship between $i$ and $C$ is
"unexpectedly" strong. In order to perform the ranking,
the cumulative probability $r(k_i^{in}) = \sum_{j=k_i^{in}}^{k_i} p(j | i, C, \mathcal{G})$ of
having a number of internal connections equal to or larger
than $k_i^{in}$ is estimated, following Ref. [31]. Given that the
vertex degree is a discrete variable, the cumulative distribution
has a specific step-wise profile for each value of
$k_i$. In order to facilitate the comparison of vertices with
different degrees, we implement a bootstrap strategy by
assigning to each vertex $i$ a value of $r$, $r_i$, randomly drawn
from the interval $[r(k_i^{in}), r(k_i^{in} + 1)]$. This choice is
important for a meaningful estimate of the clusters' significance;
other options (e.g., taking the middle points of
the interval) could lead to the identification of meaningful
clusters in random graphs. The bootstrap introduces
a stochastic element in the assessment procedure, which
will, in turn, lead to the use of Monte Carlo techniques.
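
The cumulative probability and the bootstrap step can be sketched as follows. The helper names `cumulative_r` and `bootstrap_r` are ours, and the distribution `p` is assumed to be given as a dictionary mapping each admissible number of internal neighbors to its null-model probability.

```python
import random


def cumulative_r(p, k_in):
    """r(k_in): null-model probability of having k_in or more neighbours in C."""
    return sum(prob for k, prob in p.items() if k >= k_in)


def bootstrap_r(p, k_in, rng=random):
    """Draw the score of a vertex uniformly between r(k_in + 1) and r(k_in)."""
    upper = cumulative_r(p, k_in)
    lower = cumulative_r(p, k_in + 1)
    return rng.uniform(lower, upper)
```

With this randomization, if the number of internal neighbors is itself drawn from `p`, the resulting score is uniformly distributed between zero and one, whatever the step-wise profile of the cumulative distribution.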



The variable $r$ bears the information regarding the likelihood
of the topological relation of each vertex with $C$
and has an important feature: it is a uniform random
variable distributed between zero and one for vertices
of our null model graphs. Calculating its order statistic
distributions is thus a relatively easy task. The first
candidate among the external vertices to be part of $C$ is
the vertex with the lowest value of $r$, which we denote by
$r_1$. The cumulative distribution of $r_1$ in the null model
is then given by

$$\Omega_1(r) = P(r_1 < r) = 1 - (1 - r)^{N - n_C}, \qquad (3)$$

where $n_C$ is the number of vertices in $C$. In general, let
$r_q$ be the value of variable $r$ with rank $q$ (in increasing
order of the variable $r$).







Figure 2: Probability distributions of the scores r of vertices 
external to a given subgraph $C$ of the graph. The score $r_q$
is the $q$-th smallest score of the external vertices. In this
particular case there are 10 external vertices. In the figure, 
we plot $p(r_1)$, $p(r_2)$, $p(r_3)$, $p(r_4)$, $p(r_5)$ (from left to right). As
an example, the shaded areas show the cumulative probability 
for a few values of r that would correspond to the values 
estimated in a practical situation. In this case, the black area,
$q = 4$, is the least extensive and so $c_m = \Omega_4$. If $\phi(c_m) < P$,
the vertices with scores $r_1$, $r_2$, $r_3$ and $r_4$ will be added to $C$.

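The cumulative distribution of $r_q$ is (Fig. 2):

$$\Omega_q(r) = P(r_q < r) = \sum_{i=q}^{N-n_C} \binom{N-n_C}{i} \, r^i (1-r)^{N-n_C-i}. \qquad (4)$$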
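
For independent uniform scores, the $q$-th smallest of $N - n_C$ values lies below $r$ exactly when at least $q$ of them do, so the cumulative distribution is a binomial tail. A minimal sketch of this standard order-statistics result follows; the function name `omega` and the argument `n_ext`, which stands for $N - n_C$, are ours.

```python
from math import comb


def omega(q, r, n_ext):
    """P(r_q < r): at least q of n_ext independent uniform scores fall below r."""
    return sum(comb(n_ext, i) * r**i * (1 - r) ** (n_ext - i)
               for i in range(q, n_ext + 1))
```

For $q = 1$ the sum reduces to $1 - (1 - r)^{N - n_C}$, in agreement with Eq. 3.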

The reason for the use of order statistics is that we 
assume that clustering methods tend to include in each 
community those vertices which are most strongly con- 
nected to vertices of the community. Due to correlations 
(the vertices in the clusters tend to be connected), we 
cannot calculate the statistics of the internal connections 
to the clusters, but we can do it safely for the external
vertices. The values of the different $\Omega_q$ inform us of
how much the external vertices of a group are compatible
with the statistics expected in the null model. To evaluate
the full group, we define $c_m = \min_q \{\Omega_q(r_q)\}$ among
all the neighbors of $C$, where $r_q$ are their corresponding
ranked values for the $r$ variable. The distribution of $c_m$
can be easily tabulated numerically since it only depends
on $N - n_C$. The cumulative distribution will be denoted
as $P(c_m < x) = \phi(x, N - n_C)$. In the following, we call
$\phi(c_m, N - n_C)$ the score of the cluster $C$.
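
One simple way to tabulate the distribution of $c_m$ numerically is by Monte Carlo sampling of the null model, where all scores are independent and uniform. The sketch below is only an illustration under that assumption and is not necessarily the procedure used in the OSLOM software; the names `sample_c_m` and `tabulate_phi` are ours, and `n` stands for $N - n_C$.

```python
import random
from bisect import bisect_left
from math import comb


def omega(q, r, n):
    """Cumulative distribution of the q-th smallest of n uniform scores."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(q, n + 1))


def sample_c_m(n, rng):
    """One null-model sample of c_m = min_q Omega_q(r_q)."""
    scores = sorted(rng.random() for _ in range(n))
    return min(omega(q, r, n) for q, r in enumerate(scores, start=1))


def tabulate_phi(n, samples=2000, seed=0):
    """Return an empirical estimate of x -> P(c_m < x) for n external vertices."""
    rng = random.Random(seed)
    draws = sorted(sample_c_m(n, rng) for _ in range(samples))

    def phi(x):
        # bisect_left counts the sampled values strictly smaller than x
        return bisect_left(draws, x) / samples

    return phi
```

Since $c_m$ is a minimum over several quantities that are each uniformly distributed in the null model, the estimated $\phi(x)$ lies above $x$. The cost of this naive version grows quadratically with `n`, so it is meant for small examples only.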



B. Single cluster analysis 

Now that a score to evaluate the statistical significance 
of the clusters has been introduced, the next step is to 
optimize the score across the network by dividing it into 
proper clusters. We describe first the optimization of a 
single cluster score and will extend later the method to 
deal with the full network. First of all one has to give the 
method a certain tolerance, in the following referred to as 
P. This parameter establishes when a given value of the 
score is considered significant. Our procedure consists of 
two phases: first, we explore the possibility of adding ex- 
ternal vertices to the subgraph C; second, non-significant 
vertices in C are pruned. They are described below and 
illustrated schematically in Fig. 3. 



Figure 3: Schematic diagram of the single cluster analysis. [Flow chart with three
panels: Single Cluster Analysis, Cleaning-up procedure, and Add Significant
Neighbors (ASN) routine.]



1. For each vertex $i$ outside $C$ and connected to it
by at least one edge the variable $r$ is computed.
Then we calculate $\Omega_1(r)$ for the vertex with the
smallest $r$, by using Eq. 3. If $\phi(\Omega_1(r), N - n_C) < P$,
we add the corresponding vertex to the subgraph,
which we now call $C'$. If $\phi(\Omega_1(r), N - n_C) > P$,
one checks the second best vertex, the third best
vertex, etc. If there is finally a vertex, say the $q$-th
best vertex, for which $\phi(\Omega_q(r), N - n_C) < P$,
one includes all $q$ best vertices into subgraph $C$,
yielding subgraph $C'$. At this point, no other vertex
outside $C'$ deserves to enter the community since
all the external vertices are compatible with the
statistics of the random configuration model. It
may also happen that the inequality $\phi < P$ above
holds for no external vertex, in which case we add
no vertices to $C$ and $C' = C$. Either way, we pass to
the second stage with the subgraph $C'$.

2. For each vertex $i$ in $C'$ the variable $r_i$ with respect
to the set $C' \setminus \{i\}$ is estimated. We pick the "worst"
vertex $w$ of the cluster, i.e. the vertex with the
highest value of $r$. To check for its significance we
repeat step 1 for the subgraph $C' \setminus \{w\}$. If $w$ turns
out to be significant, we keep it inside $C'$ and the
analysis of the cluster is completed. Otherwise, $w$
is moved out of $C'$ and one searches for the worst
internal vertex of $C' \setminus \{w\}$. At some point we end
up with a cluster $C''$, whose internal vertices are all
significant and the process stops.
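
The scan over ranked external vertices in step 1 can be sketched as follows. The function `add_significant_neighbors` and its arguments are ours. The cumulative distribution $\phi$ is passed in as a callable; since its tabulation is not reproduced here, the example uses `bonferroni_phi`, a conservative stand-in based on the union bound (each $\Omega_q(r_q)$ is uniform in the null model, so the probability that their minimum is below $x$ is at most $n x$). This stand-in is an assumption of the sketch and not the function used by OSLOM.

```python
from math import comb


def omega(q, r, n):
    """Cumulative distribution of the q-th smallest of n uniform scores."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(q, n + 1))


def bonferroni_phi(x, n):
    """Stand-in for phi: union-bound upper limit on P(c_m < x)."""
    return min(1.0, n * x)


def add_significant_neighbors(scores, n_ext, phi, tolerance):
    """Return the vertices to add to the cluster, best first.

    scores    : dict mapping each neighbouring vertex to its score r
    n_ext     : number of vertices outside the cluster (N - n_C)
    phi       : callable (x, n_ext) -> cumulative distribution of c_m
    tolerance : the threshold P
    """
    ranked = sorted(scores.items(), key=lambda item: item[1])
    for q, (_, r) in enumerate(ranked, start=1):
        # Stop at the first rank whose order statistic is significant.
        if phi(omega(q, r, n_ext), n_ext) < tolerance:
            return [vertex for vertex, _ in ranked[:q]]
    return []  # every external vertex is compatible with the null model
```

In the first test case below the best vertex alone is not significant, but having two scores as low as 0.01 and 0.011 among ten external vertices is, so both vertices are returned.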

The two-step procedure is a way to "clean up" C. A
cluster is left unchanged only if all the external vertices 
are compatible with the null model and all the internal 
vertices are not. A few remarks are important here: 

• There can be both good vertices outside C and bad 
ones inside. It is important to perform the complete 
procedure described above, which guarantees that 
the final cluster is significant with respect to the 
present null model (see also Ref. [31]).

• The procedure is not deterministic, because of the
stochastic component in the computation of the cumulative
probability $r$. One should therefore repeat all the
steps several times. The cluster analysis may deliver
a subgraph $C''$ that is in general different from $C$, or
an empty subgraph. For each vertex $i$ we compute
the participation frequency $f_i$, defined as the ratio
between the number of times $i$ belongs to any non-empty
$C''$ and the total number of iterations leading
to non-empty subgraphs. In general, we consider
the subgraph $C$ to be a significant cluster if the single
cluster analysis yields a non-empty subgraph $C''$
in more than 50% of the iterations. The final "cleaned"
cluster includes those vertices for which $f_i > 0.5$.

• In the worst-case scenario, the complexity of the 
cluster analysis scales with the number of vertices 
of C, times the number of neighbors of C, times 
the number of loops needed to have reliable values
for the $f_i$'s. The situation can be considerably
improved by keeping track of the order of the external
vertices at each step (using suitable data structures)
and by computing the score only for some
reasonably good vertices. For instance, one could 
pick just those vertices with r < 0.1. We numeri- 
cally checked that changing this threshold does not 
affect the results, but leads to a faster algorithm. 
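
The averaging over repeated runs described in the second remark can be sketched as follows. The function name `consensus_cluster` is ours, and the outcome of each run of the cleaning procedure is assumed to be available as a (possibly empty) set of vertices.

```python
from collections import Counter


def consensus_cluster(runs):
    """Combine the outcomes of repeated cleaning runs of one cluster.

    runs is a list of vertex sets, empty when a run found no significant
    cluster. Returns the cleaned cluster, or an empty set if the cluster
    is not significant.
    """
    non_empty = [set(c) for c in runs if c]
    # Significant only if more than 50% of the runs gave a non-empty result.
    if 2 * len(non_empty) <= len(runs):
        return set()
    counts = Counter(v for c in non_empty for v in c)
    # Participation frequency: share of the non-empty runs containing v.
    return {v for v, n in counts.items() if n / len(non_empty) > 0.5}
```

For example, from the four runs `{1, 2, 3}`, `{1, 2}`, `set()` and `{1, 2, 4}` the sketch keeps vertices 1 and 2, which appear in all three non-empty outcomes, and drops vertices 3 and 4.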



C. Network analysis 

The previous procedure deals with a single cluster $C$. It
finds the external significant vertices and includes them
into $C$. It also prunes those internal vertices that are not
statistically relevant. Now we extend this procedure by 
introducing an algorithm able to analyze the full network. 
In order to do so, we follow the method proposed by 
some of the authors in Ref. [23]. The starting point is 
a single vertex, taken at random, in the absence of any 
information. Let us suppose that we start from a random 
vertex i and that our first group is C = {i}. The method 
proceeds as follows: 

1. q vertices are added to C, considering the most sig- 
nificant among the neighbors of the cluster. The 
number q is taken from a distribution, which in 
principle can be arbitrary. We choose a power law 
with exponent —3. 

2. Perform the single cluster analysis. 
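
The random choice of $q$ in step 1 can be sketched as below. The text only fixes the exponent of the power law; the upper cutoff `q_max` (for instance, the number of available neighbors) and the function name `draw_q` are assumptions of this sketch.

```python
import random


def draw_q(q_max, rng=random, exponent=3.0):
    """Draw q in 1..q_max with probability proportional to q**(-exponent)."""
    sizes = list(range(1, q_max + 1))
    weights = [q ** -exponent for q in sizes]
    return rng.choices(sizes, weights=weights, k=1)[0]
```

With exponent 3 most draws return $q = 1$, so the seed cluster usually grows by a single vertex before the single cluster analysis takes over, while larger jumps remain possible.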

We repeat the whole procedure starting from several ver- 
tices in order to explore different regions of the network. 
This yields a final set of clusters that may overlap. This
type of local optimization was originally implemented in
the Local Fitness Method [23], to handle overlapping 
communities. The algorithm stops when it keeps find- 
ing similar modules over and over. Ideally one wishes 
to encounter the exact same clusters repeatedly. How- 
ever, the stochastic element introduced when calculating 
the vertex score can lead vertices, whose score is close 
to the threshold, to change their group assignments from 
one realization to another. This can be a problem when 
we are trying to decide whether two groups in different 
instances correspond to the same cluster. As a practical
rule, we say that two groups $C_1$ and $C_2$ are similar if
$|C_1 \cap C_2| / \min(|C_1|, |C_2|) > 0.5$, in which case they deserve
further attention. Indeed, it turns out that many of the 
clusters found are very similar or combinations of each 
other. This leads to a very important question: given a 
set of significant clusters, which ones should be kept? 
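
The practical similarity rule can be written in a few lines. The sketch reads the criterion as the overlap of the two groups relative to the smaller one, i.e. with the intersection in the numerator; this reading is an assumption, made because a ratio with the union in the numerator would always be at least one and the threshold would then carry no information. The function name `are_similar` is ours.

```python
def are_similar(c1, c2, threshold=0.5):
    """True if the overlap exceeds `threshold` times the size of the smaller group."""
    c1, c2 = set(c1), set(c2)
    if not c1 or not c2:
        return False
    return len(c1 & c2) / min(len(c1), len(c2)) > threshold
```

For instance, `{1, 2, 3, 4}` and `{3, 4, 5}` share two of the three vertices of the smaller group and are therefore similar, whereas `{1, 2, 3, 4}` and `{4, 5, 6}` are not.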

Let us consider the problem of choosing between two
clusters $C_1$ and $C_2$ and the union of the two, $C_3$. A solution
is to consider the subgraph $\mathcal{G}_3$ of the vertices in
$C_3$ and see if $C_1$ and $C_2$ are significant as modules of $\mathcal{G}_3$.
Strictly speaking we consider $C_1'$ and $C_2'$, which are the
cleaned-up clusters within $\mathcal{G}_3$ (i.e. with respect to subgraph
$\mathcal{G}_3$ only, neglecting the rest of the network). We
discard $C_3$ if $|C_1' \cup C_2'| > P_2 \cdot |C_3|$, where we set $P_2 = 0.7$.
Otherwise we discard $C_1$ and $C_2$ and we keep the union
$C_3$. Instead, if we have to decide among a set of $k$ clusters
and their union $C_\cup$, the condition to prefer the submodules
is $|\bigcup_{i=1}^{k} C_i'| > P_2 \cdot |C_\cup|$.
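
The decision between a union and its submodules amounts to a coverage test, sketched below for any number of submodules. The function name `keep_submodules` is ours, and the cleaned-up submodules (computed with respect to the subgraph of the union only) are assumed to be given.

```python
def keep_submodules(cleaned_parts, union, p2=0.7):
    """True if the cleaned submodules cover more than p2 of the union.

    cleaned_parts : iterable of vertex sets, cleaned within the union's subgraph
    union         : vertex set of the union cluster
    """
    covered = set().union(*map(set, cleaned_parts))
    return len(covered) > p2 * len(union)
```

With a union of ten vertices, two cleaned submodules of four vertices each cover 80% of it and are kept, whereas two submodules of three vertices each cover only 60%, so the union is kept instead.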

In general, we check if each cluster has significant sub- 
modules, by looking for modules in the subgraph given 
by the cluster and using the condition above to decide 
which ones to take. This leads to a set of significant 
minimal clusters, where minimal means that they have 
no significant internal cluster structure, according to the 
condition above. We also need to check whether unions 
of such minimal clusters do have internal cluster struc- 
ture, according to our rule, to decide whether the clusters 
have to be kept separated or merged. After doing this, 
we still end up with many similar modules. Given a pair 
of similar modules (in the sense defined above), we first 
check if their union has significant cluster structure: if it 
does not, we merge the two clusters, otherwise we sys- 
tematically prefer the bigger one (if they are equal-sized, 
we pick the cluster with smaller score). 

After the completion of this procedure, the output is 
a cover of the network. To reduce the stochasticity in- 
troduced by the bootstrap, the procedure is repeated in 
order to obtain several covers. All clusters of the covers 
are analyzed as described above to select among them 
the ones which will appear in the final output. 



D. OSLOM 

We have described the cleaning of a single cluster and 
how the full network is analyzed. In the following, all the 
ingredients are assembled together to form the algorithm 
that we call OSLOM (Order Statistics Local Optimiza- 
tion Method). 

A flow diagram summarizing how it works can be seen
in Fig. 4. OSLOM consists of three phases: 

• First, it looks for significant clusters, until conver- 
gence; 

• Second, it analyzes the resulting set of clusters, try- 
ing to detect their internal structure or possible 
unions thereof; 

• Third, it detects the hierarchical structure of the 
clusters. 

To speed up the method, one can start from a given parti- 
tion/cover delivered by another (fast) algorithm or from 
a priori information. In those cases, the first step will be 
to clean up the given clusters. 

Once the set of minimal significant clusters has been
found, the analysis of the hierarchies consists of the following
steps. We construct a new network formed by
clusters, where each cluster is turned into a supervertex
and there are edges between supervertices if the clusters
they represent are linked to each other. The resulting
superedges are weighted by the number of edges between
the initial clusters. There is the problem of properly assigning
edges between clusters, if the edges are incident
on overlapping vertices. Suppose we have an edge whose
endvertices $i$ and $j$ belong to $\nu_i$ and $\nu_j$ clusters, respectively.
This edge lies simultaneously between any pair of
clusters $C_i$ and $C_j$, with $C_i$ including $i$ and $C_j$ including $j$.
The contribution of the edge to the superedge between
$C_i$ and $C_j$ equals $1/(\nu_i \cdot \nu_j)$. The resulting non-integer
weights may lead to non-integer values for the weight of
superedges, whereas we need integer values in order to
use Eq. 1. For this reason, the weight of each superedge
is rounded to the nearest integer value. We stress that
the weight we deal with here indicates just how to "split"
edges; it is not related to the weight that edges may carry.
If the original network is weighted, the rescaled weight
of an edge is $w/(\nu_i \cdot \nu_j)$, $w$ being the weight of the edge
in the network. Once the supernetwork has been built,
one applies the method again, obtaining the second hierarchical
level. The latter is turned again into a supernetwork,
as we explained above, and so on, until the method
produces no clusters. In this way OSLOM recovers the
hierarchical community structure of the original graph.

Figure 4: Flow diagram of OSLOM. The levels of grey of
the squares represent different loop levels. One can provide
an initial partition/cover as input, from which the algorithm
starts operating, or no input, in which case the algorithm
will build the clusters about individual vertices, chosen at
random. OSLOM performs first a cleaning procedure of the
clusters, followed by a check of their internal structure and by
a decision on possible cluster unions. This is repeated with
different choices of random numbers in order to obtain better
statistics and more reliable information. The final step is to
generate a super-network for the next level of the hierarchical
analysis. [Flow chart; recoverable box labels: "loop over" and
"Hierarchical analysis: building next network level".]

We will describe next the main features of OSLOM,
and what it adds to the state of the art in community
detection.



1. Significant clusters

The main characteristic of OSLOM is that it is based
on a fitness measure, the score, that is tightly related to
the significance of the clusters in the configuration model.
In fact, the single cluster analysis is designed to optimize
the cluster significance as defined in Ref. [31]. Therefore
the output of OSLOM consists of clusters that are unlikely
to be found in an equivalent random graph with
the same degree sequence. The tolerance $P$, fixed initially,
determines whether such clusters are "unexpectedly
unlikely", and therefore significant, or not. So, if
the method is fed with a random graph, the output will
include very few clusters or even none at all.



2. Homeless vertices

The vertices in a random network will be deemed
homeless. Homeless vertices are those that are not assigned
to any cluster. This is a very important feature
that OSLOM includes. The presence of random noise
or non-significant vertices is an issue that may occur in
many real systems. However, very few clustering techniques
take this possibility into account. In OSLOM, it
comes as a natural output. We will quantitatively analyze
this feature when we test the method on benchmark
graphs.



3. Overlapping communities 

A natural output of OSLOM is the possibility for clusters
to overlap. Since each cluster is "cleaned" independently
of the others, a fraction of its vertices may also
belong to other clusters. We will show the efficiency
of OSLOM in unveiling overlapping vertices in
suitably designed benchmarks.

4. Cluster hierarchy 

Another relevant feature of OSLOM is the analysis of 
the hierarchical structure of the clusters. As mentioned
above, the third phase of our method includes a procedure
to take care of this issue. The results are very good
on hierarchical benchmarks. 

OSLOM generally finds different depths in different 
hierarchical branches. In fact, when the algorithm is 
applied not all vertices are grouped, as some of them 
are homeless. The coexistence of homeless vertices 
with proper clusters yields a hierarchical structure with 
branches of different depths. 
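
The construction of the supernetwork used to pass from one hierarchical level to the next can be sketched as follows, for the unweighted case. An edge whose endvertices belong to several clusters is split in equal parts among all pairs of clusters it connects, and each superedge weight is then rounded. The function name `superedge_weights` is ours, and two choices are assumptions of this sketch: edges touching a homeless vertex and edges internal to a cluster are ignored, and ties in the rounding follow Python's built-in `round` (half to even).

```python
from collections import defaultdict


def superedge_weights(edges, memberships):
    """Weights of the superedges between clusters.

    edges       : iterable of (i, j) vertex pairs
    memberships : dict mapping a vertex to the list of clusters it belongs to
    """
    weights = defaultdict(float)
    for i, j in edges:
        clusters_i = memberships.get(i, [])
        clusters_j = memberships.get(j, [])
        if not clusters_i or not clusters_j:
            continue  # assumption: homeless endvertices contribute nothing
        share = 1.0 / (len(clusters_i) * len(clusters_j))
        for a in clusters_i:
            for b in clusters_j:
                if a != b:  # assumption: internal edges give no superedge
                    weights[frozenset((a, b))] += share
    # Integer weights are needed to apply the null model to the supernetwork.
    return {pair: round(w) for pair, w in weights.items()}
```

Applying the clustering method to the network defined by these weights, and repeating the construction on its output, yields the successive levels of the hierarchy.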



5. Weighted networks 

OSLOM can be generalized to weighted graphs as well.
We assume that the probability of
having a connection between two vertices $i$ and $j$ with
a certain weight $w_{ij}$, given the vertex degrees $k_i$ and $k_j$
and their strengths $s_i$ and $s_j$, is separable into two different
terms in the configuration model: one for the topology
and another for the weight [38]. The strength of a vertex
is defined as the sum of the weights of all the edges
incident on it. We approximate the weight contribution
by

$$p(w_{ij} > x | k_i, k_j, s_i, s_j) = \exp(-x / \langle w_{ij} \rangle), \qquad (5)$$

where $\langle w_{ij} \rangle = 2 \langle w_i \rangle \langle w_j \rangle / (\langle w_i \rangle + \langle w_j \rangle)$ is the harmonic
mean of the average weights of vertices $i$ and $j$, defined
as $\langle w_i \rangle = s_i / k_i$ and $\langle w_j \rangle = s_j / k_j$, respectively. The idea
behind this expression is that the weight of an edge of the
null model should be proportional to the average weight
of its endvertices. We chose the harmonic mean
because it is more sensitive to the small values of $\langle w \rangle$.
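
A direct transcription of Eq. 5 is given below as a small sketch. The function name `weight_tail_probability` and its argument names are ours.

```python
from math import exp


def weight_tail_probability(x, k_i, s_i, k_j, s_j):
    """Null-model probability that the weight of edge (i, j) exceeds x (Eq. 5)."""
    w_i = s_i / k_i  # average weight of the edges of i
    w_j = s_j / k_j  # average weight of the edges of j
    w_ij = 2.0 * w_i * w_j / (w_i + w_j)  # harmonic mean of the two averages
    return exp(-x / w_ij)
```

As an example, for average weights 2 and 6 the harmonic mean is 3, below the arithmetic mean of 4, which illustrates its greater sensitivity to the smaller of the two values.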

We use this distribution to define a new variable $r_w$,
accounting for the probability of having a certain weight
on a given edge with the strengths of the vertices and
the general weight distribution known. We combine this
variable $r_w$ with its topological counterpart, $r_t$, obtaining
a new variable $r_{wt}$. This is a non-trivial task since the two
probabilities are defined on different sets of elements
(see the Supporting Information SI). For $r_{wt}$ we can
estimate, as before, the order statistic distributions and
we proceed just as we do for unweighted graphs.



6. Directed graphs 

OSLOM can be easily generalized to handle directed
graphs. For that, we need to define two uniformly
distributed random variables $r_{out}$ and $r_{in}$. The former is
based on the probability that vertex $i$ has outgoing edges
ending on vertices of the given subgraph $\mathcal{C}$, the latter is
based on the probability that $i$ has incoming edges
originating from vertices of $\mathcal{C}$. These two probabilities are
computed through formulas analogous to Eq. (1) or
numerical approximations to it. The final score of vertex
$i$ is given by the product $r_{in} \cdot r_{out}$. We are able to
calculate the distribution of this product and therefore to
estimate its order statistics (just as for the weighted case,
see Section 1.1 of the Supporting Information SI). The rest
of the clustering method proceeds as explained above. If
graphs have edges with both directions and weights, we
have four variables for each vertex: $r_{in}$, $r_{out}$ and the
corresponding versions for the weights. The final score is
given again by the product of these four variables.
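To make the remark about the distribution of the product concrete, here is a short Python sketch of the cumulative distribution of a product of n uniform variables. It assumes that the variables are independent and uniform on (0, 1), which is an assumption of this example and not a statement about how OSLOM treats correlations; the function name is hypothetical. Under that assumption, the negative logarithm of the product follows a Gamma distribution with shape n, which gives the closed form used below.

```python
import math


def product_uniform_cdf(z: float, n: int = 2) -> float:
    """P(U_1 * ... * U_n <= z) for independent U_i uniform on (0, 1).

    Closed form: z * sum_{k=0}^{n-1} (-ln z)^k / k!.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if z <= 0.0:
        return 0.0
    if z >= 1.0:
        return 1.0
    log_term = -math.log(z)
    total = 0.0
    term = 1.0  # holds (-ln z)^k / k!, starting from k = 0
    for k in range(n):
        if k > 0:
            term *= log_term / k
        total += term
    return z * total
```

With n = 2 this reduces to z (1 - ln z), the case of an in-score multiplied by an out-score; n = 4 would correspond to graphs with both directions and weights. Applying the function to the observed product maps it back to a variable that is uniform under the same independence assumption.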



7. Dynamical networks 

Time-stamped networked datasets are usually divided
into snapshots, condensing the relational information
between vertices within different time windows. Snapshots
are typically analyzed separately, whereas it would be
more informative to combine the information from
different time slices. For instance, consider two snapshots
$\mathcal{G}_t$ and $\mathcal{G}_{t+\Delta t}$ at times $t$ and $t + \Delta t$, respectively. A
simple idea is to find the partition/cover of the network
at time $t$, by applying the method to the corresponding
snapshot, and to use the result as an input for the
application of the method to the network at time $t + \Delta t$.
In this way one can see how the community structure at
time $t$ "evolves" to that at time $t + \Delta t$. This is a rather
general approach: it can be adopted for other algorithms
for community detection, like greedy optimization
techniques. OSLOM has the useful property that it can start
from any initial partition/cover, which can be given as
input. In this way the clusters found in $\mathcal{G}_t$ can be used as
initial condition for the analysis of $\mathcal{G}_{t+\Delta t}$. With this
approach, the new partition/cover is closer to that in $\mathcal{G}_t$ and
we are able to track the groups' evolution. Naturally, if
the two snapshots are very different from each other
(because they refer to times between which the system has
changed considerably, for instance), OSLOM produces a
partition/cover in $\mathcal{G}_{t+\Delta t}$ that is uncorrelated with that
of $\mathcal{G}_t$.
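The warm-start scheme described above can be sketched as a small driver loop in Python. The callable `detect` is a hypothetical stand-in for any community detection routine that accepts an optional initial cover (it is not an actual OSLOM interface), and the type alias `Cover` is likewise introduced only for this example.

```python
from typing import Callable, Hashable, Iterable, List, Optional, Set

Cover = List[Set[Hashable]]


def track_communities(
    snapshots: Iterable[object],
    detect: Callable[[object, Optional[Cover]], Cover],
) -> List[Cover]:
    """Analyze snapshots in time order, seeding each run with the previous cover."""
    covers: List[Cover] = []
    previous: Optional[Cover] = None  # the first snapshot starts from scratch
    for graph in snapshots:
        current = detect(graph, previous)
        covers.append(current)
        previous = current  # becomes the initial condition for the next snapshot
    return covers
```

The only design choice here is that each snapshot sees the cover of its immediate predecessor, which mirrors the two-snapshot description in the text.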



8. Complexity 

The complexity of OSLOM cannot be estimated ex- 
actly, as it depends on the specific features of the com- 
munity structure at study. Therefore we carried out a nu- 
merical study of the complexity, whose results are shown 
in Fig. 5. 

We apply the method on the LFR benchmark [41],
that we have used extensively to test the performance of
OSLOM. We have used both the standard version of the
algorithm and a fast implementation, in which the
algorithm acts on the partition delivered by a quick method.
For each version we have considered undirected and
unweighted LFR benchmark graphs with two different levels
of mixture between the clusters ($\mu = 0.1$ and $\mu = 0.6$,
corresponding to well separated and well mixed clusters).
The other parameters needed to build the LFR
benchmark graphs are the same as for the graphs used in
Fig. 6. The diagram of Fig. 5 shows the execution time (in

Figure 5: Complexity of OSLOM. The diagram shows how the 
execution time of two different implementations of the algo- 
rithm scales with the network size (expressed by the number 
of vertices), for LFR benchmark graphs. 



seconds) as a function of the number N of vertices of 
the graphs. The processes were run on a workstation HP 
Z800. The time scales as a power law of N with good 
approximation, if the graphs are not too small. The be- 
havior seems to depend neither on how mixed commu- 
nities are, nor on the particular implementation of the 
algorithm (there seems to be just a factor between the 
corresponding curves). Power law fits of the large-N por- 
tion of the curves yield an exponent 1.1(1), which implies 
that the complexity is essentially linear in this case. 
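A power-law exponent of this kind is commonly estimated by a straight-line fit in log-log space. The Python sketch below shows one way to do so with ordinary least squares; the function name is hypothetical and the fitting procedure is an assumption of this example, since the text does not specify how the fit was performed.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(sizes: Sequence[float], times: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log(time) = log(c) + alpha * log(size).

    Returns the exponent alpha and the prefactor c.
    """
    if len(sizes) != len(times) or len(sizes) < 2:
        raise ValueError("need at least two (size, time) pairs")
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    prefactor = math.exp(mean_y - alpha * mean_x)
    return alpha, prefactor
```

Restricting the input to the large-N points, as done in the text, avoids the small-graph regime where the scaling has not yet set in.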



III. RESULTS 
A. Artificial networks 

In this section we test OSLOM against artificial bench- 
marks, comparing its performance with those of the best 
algorithms currently available. We mostly adopted the
LFR benchmark [41, 42], a class of graphs with planted
community structure and heterogeneous distributions of
vertex degree and community size. Tests on the well
known Girvan-Newman (GN) benchmark [8] are shown in
the Supporting Information SI. In this section we present
tests on undirected and unweighted networks, with and 
without hierarchical structure and overlapping commu- 
nities. We also show how OSLOM handles the presence 
of randomness in the graph structure. Tests on weighted 
networks and on directed networks can be found in the 
Supporting Information SI. 

In the following sections, for each network, we compose 
the results of 10 iterations for the network analysis for 
the first hierarchical level and the results of 50 iterations 
for higher levels, if any. The single cluster analysis was
repeated 100 times for each cluster.



1. LFR benchmark 

The LFR benchmark [41, 42], like the GN benchmark,
is a particular case of the planted $\ell$-partition model [43],
which is the simplest possible model of networks with
communities. The planted $\ell$-partition model is a class
of graphs whose vertices are divided into $\ell$ equal-sized
groups, such that the probability that two vertices of the
same group are linked is $p$, while the probability that two
vertices of different groups are linked is $q$, with $p > q$.
The planted $\ell$-partition model is too simple to describe
real networks. Vertices have essentially the same degree
and communities have the same size, at odds with
empirical analyses showing that both features typically are
broadly distributed [19, 44-48]. Therefore we have
recently proposed a generalization of the model, the LFR
benchmark, by introducing power-law distributions for
the vertex degree and the community size, with
exponents $\tau_1$ and $\tau_2$, respectively [41]. The LFR
benchmark poses a far harder challenge to algorithms than the
benchmark by Girvan and Newman, which is regularly 
used in the literature, and is more suitable to spot their 
limits. We are of course aware that the communities of 
the model are still too simple to match the communities 
of real networks. Other features should be introduced, 
to tailor the model graphs onto the real graphs. This is 
certainly doable, and could be specialized to the particu- 
lar domain of applicability one is interested in. Still, the 
clusters of the LFR benchmark are a much better proxy 
of real communities than the clusters of other benchmark 
graphs. 
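For reference, the planted partition model with equal-sized groups that the LFR benchmark generalizes can be generated in a few lines of Python. This is a minimal sketch with hypothetical names; it samples every vertex pair explicitly and is therefore only suitable for small graphs, and it is not the LFR generator itself.

```python
import random
from itertools import combinations
from typing import List, Optional, Tuple


def planted_partition(groups: int, group_size: int, p: float, q: float,
                      seed: Optional[int] = None) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Sample a planted partition graph.

    Returns the group label of each vertex and the list of undirected edges.
    Pairs in the same group are linked with probability p, other pairs with q.
    """
    rng = random.Random(seed)
    n = groups * group_size
    membership = [v // group_size for v in range(n)]
    edges = []
    for u, v in combinations(range(n), 2):
        prob = p if membership[u] == membership[v] else q
        if rng.random() < prob:
            edges.append((u, v))
    return membership, edges
```

Because every vertex has the same expected degree and every group has the same size, graphs produced this way lack the broad degree and community-size distributions discussed in the text.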

Vertices of the LFR benchmark have a fixed degree (in
this case taken from the given power law distribution),
so the two parameters $p$ and $q$ of the planted $\ell$-partition
model are not independent and we choose as
independent variable the mixing parameter $\mu$, which is the ratio
of the number of external neighbors of a vertex to the
total degree of the vertex. Small values of $\mu$ indicate well
separated clusters, whereas for higher and higher values
communities become more and more mixed with each other.
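The mixing parameter of each vertex can be measured directly on a graph with a known partition, as in the following Python sketch. It assumes an undirected, unweighted graph given as an edge list and non-overlapping communities; the function name is hypothetical.

```python
from typing import Dict, Hashable, Iterable, Mapping, Tuple


def mixing_parameter(edges: Iterable[Tuple[Hashable, Hashable]],
                     membership: Mapping[Hashable, Hashable]) -> Dict[Hashable, float]:
    """Fraction of each vertex's neighbors that lie outside its own community."""
    degree: Dict[Hashable, int] = {}
    external: Dict[Hashable, int] = {}
    for u, v in edges:
        # Count the edge once from each endpoint.
        for a, b in ((u, v), (v, u)):
            degree[a] = degree.get(a, 0) + 1
            if membership[a] != membership[b]:
                external[a] = external.get(a, 0) + 1
    return {v: external.get(v, 0) / degree[v] for v in degree}
```

Isolated vertices do not appear in the result, since their mixing parameter is undefined.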

As a term of comparison we used Infomap [49], which
has proved to be very accurate on artificial benchmark
graphs [50]. Fig. 6 shows the comparative performance
of OSLOM and Infomap on the LFR benchmark, with
undirected and unweighted edges and non-overlapping
clusters. As a measure of similarity between the planted
partition and that recovered by the algorithm we adopted
the Normalized Mutual Information (NMI) [51], in the
extended version proposed in Ref. [23], which enables one
to compare both partitions and covers. We used this
definition also for hard planted partitions, since modules
found by OSLOM may be overlapping. In all tests on
artificial graphs each point is always an average over 100
realizations.
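For orientation, the standard normalized mutual information between two hard partitions can be computed as in the Python sketch below. This is explicitly not the extended version of Ref. [23] used in the tests, which also handles covers; it is the simpler definition 2I(X;Y)/(H(X)+H(Y)), shown here only to illustrate what the similarity score measures. The function name is hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def normalized_mutual_information(labels_a: Sequence[Hashable],
                                  labels_b: Sequence[Hashable]) -> float:
    """Standard NMI of two hard partitions given as per-vertex label lists."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same vertices")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    def entropy(counts: Counter) -> float:
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    h_a = entropy(count_a)
    h_b = entropy(count_b)
    mutual = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0.0:
        return 1.0  # both partitions are trivial and identical
    return 2.0 * mutual / (h_a + h_b)
```

The score is 1 when the two partitions coincide up to a relabeling of the groups and 0 when they are statistically independent.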

The plots correspond to two network sizes, N = 1000 

Figure 6: Tests on undirected and unweighted LFR
benchmark graphs without overlapping communities. The
parameters of the graphs are: average degree $\langle k\rangle = 20$, maximum
degree $k_{max} = 50$, exponents of the power law distributions
$\tau_1 = 2$ for degree and $\tau_2 = 1$ for community size. S and B
mean that community sizes are in the range [10, 50] ("small")
and [20, 100] ("big"), respectively. We considered two
network sizes: $N = 1000$ (top) and $N = 5000$ (bottom). The two
curves refer to OSLOM (diamonds) and Infomap (circles).



Figure 7: Tests on large undirected and unweighted LFR 
benchmark graphs without overlapping communities. The 
network sizes are N = 50000 (left) and N = 100000 (right), 
the maximum degree kmax = 200 and the community size 
ranges from 20 to 1000. The other parameters are the same 
as those used for the graphs of Fig. 6. The two curves refer 
to OSLOM (diamonds) and Infomap (circles). 



and N = 5000, and two ranges of community size, [10, 50] 
("small") and [20, 100] ("big"), that we indicate with the 
letters S and B, respectively. In this way we can check 
how much the performance of the algorithm is affected by 
the network size and the average size of the communities. 
The other network parameters are given in the caption. 
From the plots we conclude that OSLOM and Infomap 
have a basically equivalent performance. 

It is important to test the performance of the algo- 
rithms on large graphs as well, given the increasing avail- 
ability of large networked datasets. The question is if and 
how their performance is affected by the network size. 
Fig. 7 shows that both OSLOM and Infomap are effective 
at finding communities on large LFR graphs. We remark 
that the inferior accuracy of OSLOM when communities 
are better defined comes from the fact that the method 
occasionally finds homeless vertices, i.e. vertices that are 
not significantly linked to any cluster. These are vertices 
that happen not to have a significant excess of neighbors 
within their community with respect to the number of 
neighbors in the other communities, despite the fact that 
the average number of internal neighbors is high. This 
happens because of fluctuations, and the method judges 
such vertices as not belonging to any group, which makes 
sense. This issue of the homeless vertices is a general 
feature of OSLOM. One should not judge it negatively, 
though. If a vertex $i$ happens to have a number of
external neighbors which is appreciably higher than the
expected external degree of the vertex, $\mu k_i$, the condition
$p > q$ of the planted $\ell$-partition model does not hold, so
in principle the vertex should not be put in its original
community. The confusion derives from the fact that the
condition $p > q$ holds on average.



2. LFR benchmark with overlapping communities 

The LFR benchmark also accounts for overlapping 
communities, by assigning to each vertex an equal num- 
ber of neighbors in different clusters [42]. To simplify 
things, we assume that each vertex belongs to the same 
number of communities. We cannot use Infomap for 
the comparison, as it delivers "hard" partitions, without 
overlaps between clusters. So we used two recent
methods that have a good performance on LFR graphs with
overlapping communities: COPRA [52], based on label
propagation [53], and MOSES [54], based on stochastic
block modeling [55]. COPRA and MOSES are more
effective at detecting overlapping communities in LFR
benchmark graphs than the popular Clique Percolation Method
(CPM) [19], which is the reason why we do not use the
CPM here. In Fig. 8 we show how the performance
of each method decays with the fraction of overlapping 
vertices, for different choices of the mixing parameter 
and for the small (S) and big (B) communities defined 
above. Since in social networks there may be many ver- 
tices belonging to several groups, we also considered the 
extreme situation of graphs consisting entirely of overlap- 
ping vertices. In this case, by increasing the number of 
memberships of the vertices communities become more 
fuzzy and it gets harder and harder for any method to 

Figure 8: Test on undirected and unweighted LFR benchmark with overlapping communities. The parameters are: $N = 1000$,
$\langle k\rangle = 20$, $k_{max} = 50$, $\tau_1 = 2$, $\tau_2 = 1$. S and B indicate the usual ranges of community sizes we use: [10, 50] and [20, 100],
respectively. We tested OSLOM against two recent methods to find covers in graphs: COPRA [52] and MOSES [54]. The left
panel displays the normalized mutual information (NMI) between the planted cover and the one recovered by the algorithm,
as a function of the fraction of overlapping vertices. Each overlapping vertex is shared between two clusters. The four curves
correspond to different values of the mixing parameter $\mu$ (0.1 and 0.3) and to the community size ranges S and B. The right
panel shows a test on graphs whose vertices are all shared between clusters. Each vertex is a member of the same number of
clusters. The plot shows the NMI as a function of the number of memberships of the vertices. Each curve corresponds to a
given value of the average degree $\langle k\rangle$. The graph parameters are $N = 2000$, $k_{max} = 60$, $\mu = 0.2$, $\tau_1 = 2$, $\tau_2 = 1$. Community
sizes are in the range [20, 50].



correctly identify the modules. From Fig. 8 we deduce 
that OSLOM significantly outperforms COPRA in both 
tests and MOSES in the test with overlapping and non- 
overlapping vertices, while the performances of OSLOM 
and MOSES are quite close when all vertices are overlap- 
ping. 



3. Hierarchical LFR benchmark 

OSLOM is capable of handling hierarchical community
structure as well. To test its performance we have
designed an algorithm that produces a version of the LFR
benchmark with hierarchy. To keep things simple, we
consider a two-level hierarchical structure (Fig. 9). The
idea is to use the wiring procedure of the original
algorithm twice, first for the micro-communities and then for
the macro-communities. In order to do so, we need two
mixing parameters: $\mu_1$, the fraction of neighbors of each
vertex belonging to different macro-communities; $\mu_2$, the
fraction of neighbors of each vertex belonging to the same
macro-community but to different micro-communities.
The question is whether the algorithm is able to re- 
cover both planted partitions of the benchmark, which 
we call Fine (micro-communities) and Coarse (macro- 
communities). The partitions found by the algorithm 
can be one, two or more, we call them partition 1,2,3.... 
In the test, whose results are illustrated in Fig. 10, we 
compare the Fine partition with partition 1 (Fine 1), the 
Coarse partition with partition 2 (Coarse 2), and the 
Coarse partition with partition 1 (Coarse 1). We compare 
OSLOM with a recent extension of Infomap to networks 

Figure 9: A realization of the hierarchical LFR benchmark
with two levels. Stars indicate overlapping vertices. 



with hierarchical community structure [56]. In the plots
we show how the similarity of the three pairs of partitions
mentioned above varies by increasing $\mu_2$ but keeping $\mu_1$
constant (we picked the values $\mu_1 = 0, 0.1, 0.2, 0.3$). For
a better comparison of the panels we put on the x-axis
the sum $\mu_1 + \mu_2$, representing the fraction of neighbors
of a vertex not belonging to its micro-community. We
find that, when $\mu_2$ increases, the Fine partition becomes
difficult to resolve and, for $\mu_1 + \mu_2 \gtrsim 0.7$, it cannot be
found anymore and both algorithms can only find the
Coarse partition. Instead, for smaller values of $\mu_2$, the
algorithms can recover both levels. OSLOM performs
better than Infomap if $\mu_1$ is not too small.
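The two mixing parameters of the hierarchical benchmark can be measured for a single vertex as in the Python sketch below. It assumes non-overlapping micro- and macro-communities stored in two dictionaries; the function and argument names are hypothetical.

```python
from typing import Hashable, Mapping, Sequence, Tuple


def hierarchical_mixing(vertex: Hashable,
                        neighbors: Sequence[Hashable],
                        micro: Mapping[Hashable, Hashable],
                        macro: Mapping[Hashable, Hashable]) -> Tuple[float, float]:
    """Return (mu1, mu2) for one vertex.

    mu1: fraction of neighbors in a different macro-community.
    mu2: fraction of neighbors in the same macro-community
         but in a different micro-community.
    """
    k = len(neighbors)
    if k == 0:
        raise ValueError("mixing parameters are undefined for isolated vertices")
    other_macro = sum(1 for u in neighbors if macro[u] != macro[vertex])
    same_macro_other_micro = sum(
        1 for u in neighbors
        if macro[u] == macro[vertex] and micro[u] != micro[vertex]
    )
    return other_macro / k, same_macro_other_micro / k
```

The sum of the two returned values is the fraction of neighbors outside the vertex's micro-community, which is the quantity placed on the x-axis of the plots.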



4. Random graphs and noise 

We check whether OSLOM is also able to recognize
the absence, and not simply the presence, of community
structure. In random graphs vertices are connected to
each other at random, modulo some basic constraints
like, e.g., keeping some prescribed degree distribution
or sequence. In this way, there are by definition no groups
of vertices that preferentially link to each other, so there 
are no communities. There may be subgraphs with an in- 
ternal edge density higher than the average edge density 
of the whole network, but they originate from stochas- 
tic fluctuations (noise). A good community finding al- 
gorithm should be able to recognize that such subgraphs 
are false positives, and discard them. Here we want to see 
if OSLOM distinguishes "order" from "noise". For this 







Figure 10: Test on hierarchical LFR benchmark graphs (un- 
weighted, undirected and without overlapping clusters). We 
compare three pairs of partitions: the lowest hierarchical par- 
tition found by the algorithm (indicated by 1) with the set 
of micro-communities of the benchmark (Fine); the lowest hi- 
erarchical partition found by the algorithm with the set of 
macro-communities of the benchmark (Coarse); the second 
lowest hierarchical partition found by the algorithm
(indicated by 2) with the set of macro-communities of the
benchmark. The corresponding similarities are plotted as a function
of $\mu_1 + \mu_2$, for fixed $\mu_1$. There are 10000 vertices, the average
degree $\langle k\rangle = 20$, the maximum degree $k_{max} = 100$, the size
of the macro-communities lies between 400 and 4000 vertices,
the size of the micro-communities lies between 10 and 100
vertices. The exponents of the degree and community size
distributions are $\tau_1 = 2$ and $\tau_2 = 1$.



purpose, we carried out two tests. In Fig. 11 we applied
OSLOM and Infomap to Erdős-Rényi random graphs [57]
and scale-free networks [58]. The goal is to see whether
the algorithms recognize that there are no actual
communities. Good answers are the partition with as many
communities as vertices, or the partition with all vertices
in the same community. Let us call $\mathcal{P}$ the partition found
by the algorithm at hand. Clusters in $\mathcal{P}$ containing at
least two vertices and smaller than the whole network
indicate that the method has been fooled. The fraction
of graph vertices belonging to those clusters is a measure

Figure 11: Test on random graphs. We plot the fraction of
vertices belonging to non-trivial clusters (i.e. to clusters with
more than one and less than $N$ vertices, where $N$ is as usual
the size of the graph), as a function of the average degree of
the graph. The curves correspond to Erdős-Rényi graphs
(diamonds) and scale-free networks (circles). All graphs have
$N = 1000$ vertices. The only parameter needed to build
Erdős-Rényi graphs is the probability that a pair of
vertices is connected, which is determined by the average degree
$\langle k\rangle$. The scale-free networks were built with the
configuration model [39], starting from a fixed degree sequence for the
vertices obeying the predefined power law distribution. The
parameters of the distribution are: degree exponent $\gamma = 2$,
maximum degree $k_{max} = 200$.



of reliability: the lower this number, the better the
algorithm. In Fig. 11 we show this variable as a function of
the average degree $\langle k\rangle$ of the random graphs we
considered. For OSLOM it remains very low for all values of
$\langle k\rangle$. This is not surprising, since OSLOM estimates the
statistical significance of clusters, and is therefore ideal
to detect stochastic fluctuations. Infomap instead finds
many non-trivial clusters when $\langle k\rangle$ is low, whereas it
correctly recognizes the absence of community structure if
$\langle k\rangle$ increases.
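The reliability measure used in this test, the fraction of vertices in non-trivial clusters, is straightforward to compute. The Python sketch below uses a hypothetical function name and accepts clusters as arbitrary collections of vertex identifiers.

```python
from typing import Hashable, Iterable


def nontrivial_fraction(clusters: Iterable[Iterable[Hashable]], n_vertices: int) -> float:
    """Fraction of vertices in clusters with at least 2 and fewer than N vertices."""
    covered = set()
    for cluster in clusters:
        members = set(cluster)
        if 2 <= len(members) < n_vertices:
            covered |= members  # each vertex counts once, even if clusters overlap
    return len(covered) / n_vertices
```

Both acceptable answers on a random graph, all singletons or one cluster containing every vertex, give a value of 0.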

The second test deals with graphs consisting of an or- 
dered part, with well-defined clusters, and a noisy part, 
consisting of vertices randomly attached to the rest of the 
network. The ordered part is an LFR benchmark graph 
with 1000 vertices and represents the starting configu- 
ration of our system. The noisy vertices (up to 2000 in 
number) are successively added in sequence, and a newly 
added vertex is linked to the other ones via preferential 
attachment [58]. The initial degree of the noisy vertices
is drawn from a power law distribution with kmax = 100 
and exponent 3. We measure two things, as a function of 
the number of noisy vertices: the similarity between the 
set of noisy vertices and the set of homeless vertices found 
by OSLOM, which is expressed by the Jaccard Index [59] 
(Fig. 12, left); the similarity between the planted parti- 



Figure 12: Test on graphs including communities and noise. 
The communities are those of an LFR benchmark graph 
(undirected, unweighted and without overlapping clusters), 
with $N = 1000$, $\langle k\rangle = 20$, $k_{max} = 50$, $\mu = 0.2$. The
cluster size ranges from 10 to 50 vertices. The noise comes by
adding vertices which are randomly linked to the existing ver- 
tices, via preferential attachment. The test consists in check- 
ing whether the community finding algorithm at study (here 
OSLOM, Infomap and COPRA) is able to find the commu- 
nities of the planted partition of the LFR benchmark and to 
recognize as homeless the other vertices. 



tion of the ordered part of the graph and the subset of the 
partition found by OSLOM including (only) the vertices 
of the ordered part, which is expressed by the normal- 
ized mutual information (Fig. 12, right). We compare 
OSLOM with Infomap and COPRA [52]. We find that
OSLOM correctly separates the clusters and the noise
up to a number of about 300 noisy vertices, which
represent almost a third of the whole network. Infomap and
COPRA, instead, do not recognize the noisy vertices, no
matter how small their number is. Also, they tend to mix
noisy vertices with the clusters of the planted partition
of the ordered part, as shown by the fact that the
partition they recover never exactly matches the planted
partition, not even when just a few noisy vertices are present.
These results are actually understandable in the case of 
Infomap, which is based on the minimization of the code 
length required to describe random walks taking place on 
the graph: singletons (clusters consisting of single ver- 
tices) are generally not admitted because they increase 
the amount of information required to map the process, 
due to the high number of transitions of the walker from 
the singletons to the rest of the graph and back. 
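The Jaccard index used above to compare the set of noisy vertices with the set of homeless vertices is the size of the intersection divided by the size of the union. A minimal Python sketch follows; the function name is hypothetical, and returning 1 when both sets are empty is a convention chosen for this example.

```python
from typing import Hashable, Iterable


def jaccard_index(set_a: Iterable[Hashable], set_b: Iterable[Hashable]) -> float:
    """Jaccard similarity |A & B| / |A | B| of two collections of vertices."""
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        return 1.0  # convention for two empty sets (an assumption of this sketch)
    return len(a & b) / len(union)
```

A value of 1 means that the homeless vertices found by the algorithm are exactly the noisy vertices that were added.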



B. Real networks 

In this section we discuss the application of OSLOM to 
networks from the real world. In Table 1 we list the net- 

Figure 13: Application of OSLOM to real networks: the word association network. Stars indicate overlapping vertices.



works considered in our analysis, along with some basic 
statistics obtained from the detection of their community 
structure with OSLOM. 

We analyzed different types of systems: social, infor- 
mation, biological and infrastructural networks. Here we 
discuss only some of them, the rest of the analysis can 
be found in Appendix [P] 



1. The word association network 

This network is built on the University of South Florida 
Free Association Norms [60]. Here the presence of an 
edge between words A and B indicates that some people 
associate B to the word A. This network is considered a 
paradigmatic example of a graph with overlapping
communities [19], since several words may have various meanings
and belong to different groups of words. In Fig. 13 we
see a few subgraphs of the word association network,
revolving around four keywords: bright, knowledge, music
and play. We see that the keywords are shared among
several clusters, which are semantically highly
homogeneous. For instance, bright belongs to three groups,
centered on the words color, shine and smart, respectively,
which makes sense. In the same subgraph, the words sun
and dark are also overlapping vertices, belonging to the
groups of color and shine, as one might expect. In the
subgraph centered on knowledge, one distinguishes the
groups referring to the words mind, intelligent, expert
and college/university. Here there are many overlapping
vertices, like the word intelligence, shared between the
groups of mind and intelligent, and a bunch of terms
indicating (mostly) professional status within schools and/or
universities, like student, professor, teacher, etc., which
lie between the groups of expert and college/university.
In the third subgraph, the word music is shared by the
groups of instrument, song/dance and noise/sound; other
overlapping vertices are the words sing and voice, lying
between song/dance and noise/sound, and the words bass
and saxophone, belonging to the groups of song/dance
and instrument. Finally, the word play sits between the
communities of sport, music and youth/kid; other
overlapping vertices in this subgraph include game, children,
toy, etc.



2. UK commuting 

This is the network of flows of commuters between 
areas of the United Kingdom, and therefore it has a 
clearly geographic character. It is composed of 10 608 
vertices, each representing a ward, i. e. a geographi- 

Network                          N            E     ⟨k⟩      N_c    ⟨s⟩   ⟨m⟩      f_h
Zachary's club                  34           78    4.59        2   17.0  1.03   0.0294
Dolphins                        62          159    5.13        2   32.5  1.08   0.0322
Football                       115          613    10.7       11   10.0  1.00   0.0434
UK commuting                10 608    1 220 337  230.07      248  45.43  1.06  0.00386
C. elegans                     453        2 025    8.94       25  17.04  1.22    0.229
Word association             7 207       31 784    8.82      261  22.48  1.35    0.395
Live Journal             4 846 609   42 851 237    17.6  407 451  10.01  1.19    0.294
www.uk                  18 484 117  292 244 462   15.81  590 257  28.08  1.02    0.125
US airports 2009 (jan)         448        7 659   34.19       11  33.81  1.28    0.352
US airports 2009 (mar)         456        8 491   37.24        6  67.83  1.22    0.272
US airports 2009 (jun)         453        8 480   37.42        9  45.33  1.28    0.315
US airports 2009 (sep)         452        7 870   34.81        9  41.55  1.26    0.347


Table I: Basic statistics of the real networks we analyzed, including the main features of their community structure, detected
by OSLOM. From left to right, we list the number of vertices $N$ and edges $E$, the average degree $\langle k\rangle$, the number of clusters
$N_c$, the average cluster size $\langle s\rangle$, the average number of memberships per vertex $\langle m\rangle$ and the fraction $f_h$ of vertices not assigned
to any cluster (homeless vertices). The values related to the community structure refer to the lowest hierarchical level.
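The community-related columns of Table I can be derived from a cover as in the Python sketch below. The function name and the dictionary keys are hypothetical. The sketch assumes an undirected graph, so that the average degree is 2E/N, and assumes that the average number of memberships is taken over vertices belonging to at least one cluster; the latter is an assumption of this example, although it is numerically consistent with the first two rows of the table.

```python
from typing import Dict, Hashable, Sequence, Set


def cover_statistics(n_vertices: int, n_edges: int,
                     clusters: Sequence[Set[Hashable]]) -> Dict[str, float]:
    """Summary statistics of a cover, in the spirit of the columns of Table I."""
    sizes = [len(cluster) for cluster in clusters]
    assigned = set().union(*clusters)  # vertices with at least one membership
    memberships = sum(sizes)           # overlapping vertices are counted repeatedly
    return {
        "avg_degree": 2 * n_edges / n_vertices,
        "n_clusters": len(clusters),
        "avg_cluster_size": memberships / len(clusters) if clusters else 0.0,
        "avg_memberships": memberships / len(assigned) if assigned else 0.0,
        "homeless_fraction": (n_vertices - len(assigned)) / n_vertices,
    }
```

As a check of the bookkeeping, a made-up cover of 34 vertices with two clusters of 17 vertices that share one vertex and leave one vertex unassigned gives an average cluster size of 17, about 1.03 memberships per assigned vertex and a homeless fraction of about 0.029. This toy cover is not the cover found by OSLOM; it only shows that the definitions reproduce numbers of the same form as the first row.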




Figure 14: Application of OSLOM to real networks: flows of commuters in the UK. Black points indicate overlapping vertices. 



cal division used in the UK census for statistical pur- 
poses. The whole territory of the United Kingdom is 
divided into wards. Each edge corresponds to a flow 
of commuters between the ward of origin and that of 
destination, with a weight accounting for the number of 
commuters per day. The data were collected during the 
2001 UK census, when the ward of residence and the 
ward of work/study was registered for a sizeable part of 
the British population. The database can be accessed
online at the site of the Office for National Statistics
(http://www.ons.gov.uk/census). OSLOM finds three
hierarchical levels (Fig. 14). The clusters of the second
level delimit geographical areas typically centered about 
one major town. In the highest level the areas of Eng- 
land, Wales, Scotland and Northern Ireland are clearly 
recognizable. Interestingly, Northern Ireland and Scot- 
land are parts of the same community, due to the large 
flow of commuters between the two regions, despite the 
geographical separation. Black points represent overlap- 
ping vertices. 



3. Live Journal and UK Web 

We also applied OSLOM to two large networks.
The first is a network of friendship relationships
between users of the on-line community LiveJournal
(www.livejournal.com), and was downloaded
from the Stanford Large Network Dataset Collection
(http://snap.stanford.edu/data/). The second is a
crawl of the Web graph carried out by the Stanford
WebBase Project
(http://dbpubs.stanford.edu:8091/~testbed/doc2/WebBase/),
within the UK domain (.uk). We recall that the Web
graph is a directed graph whose vertices are Web pages,
while the edges are the hyperlinks that enable one to
surf from one page to another. These two systems are
too large for OSLOM, due to the huge variety of possible
cluster sizes to explore. Therefore we applied a two-step
method: in the first step, we derived an initial partition
with the Louvain method [61], which is able to handle
large networked datasets; in the second step, we applied
OSLOM to refine the clusters of that partition. In principle, this



[Figure 15 plot residue. Panels: LiveJournal (Infomap, LPM, OSLOM 1, OSLOM 2) and web uk (OSLOM 1, OSLOM 2); horizontal axis: module size s.]



Figure 15: Application of OSLOM to real networks:
friendships of LiveJournal users (left) and sample of the .uk domain
of the Web graph (right). We show the distribution of cluster
sizes obtained by OSLOM for the first two hierarchical levels
(OSLOM 1 and OSLOM 2). For LiveJournal we can compare
the distributions with those found with Infomap [49] and the
Label Propagation Method (LPM) by Leung et al. [62].



procedure should yield the same partitions/covers as 
applying OSLOM directly, if one repeated OSLOM's 
cluster search many times. But this would make the 
calculations too lengthy, so, in order to complete the 
analysis within a reasonable time, it is necessary to keep 
the number of iterations low. In this way there is the 
big advantage of drastically reducing the computational 
complexity, which makes large systems tractable, even 
if results would be more accurate if one could apply 
OSLOM from scratch. Clearly, since different iterations 
are independent processes, one could considerably increase
the statistics by distributing the iterations among 
different processors, if available. 

In Fig. 15 we present the distribution of cluster sizes 
of the first two hierarchical levels found by OSLOM. The 
results are obtained by performing a single iteration on a 
workstation HP Z800. For the Web graph, which is the 
larger system, with nearly 20 million vertices and 300 
million edges (see Table 1), the analysis was completed 
in about 40 hours. For the social network of LiveJournal 
we can compare the results with the corresponding dis- 
tributions found by Infomap and the Label Propagation 
Method (LPM) proposed by Leung et al. [62], which were
computed in a recent analysis [48]. In that work the
original Infomap was used, so neither Infomap nor the LPM
could detect hierarchical community structure and there 
is just one cluster size distribution, corresponding to the 
single partition recovered. The distributions are broad 
and quite similar across different methods. Interestingly, 
the two hierarchical levels of LiveJournal (OSLOM 1 and 
OSLOM 2) are not too different, indicating a sort of self- 



similarity of the community structure. For the Web the 
two levels are more dissimilar and the distributions have 
a clear power law decay (with different exponents) up to 
a cutoff, which is approximately the same for both curves 
(- 2000 vertices). 



4. Dynamic datasets: the US air transportation network

For the last application, we used a time-stamped
dataset, the US air transportation network. The data
can be downloaded from the Bureau of Transportation
Statistics (US government) (http://www.bts.gov). Vertices
are airports in the USA and edges are weighted by
the number of passengers transported along the
corresponding routes. In Fig. 16 we show the geographical
location of the airports and their communities, indicated
by the symbols, for three snapshots, corresponding to the
traffic in March, June and September 2009, respectively.
We recall that for dynamical datasets we usually take
the partition/cover $\mathcal{P}(t)$ of the system at time $t$, and we
use it as initial partition/cover for the topology of the
system at time $t + \Delta t$, which is then refined by OSLOM,
in order to "adapt" $\mathcal{P}(t)$ to the current structure. This
is done to exploit the information of more snapshots at
the same time. Since the three maps of Fig. 16 are
mostly illustrative, communities were derived by applying
OSLOM directly to the corresponding snapshots, for
simplicity. The diagram indicates the similarity between
networks and their corresponding partitions/covers in
different snapshots. Each snapshot represents the whole
traffic of one trimester, which corresponds to a season,
while $\Delta t = 1$ year, as we want to measure the variation
of the network structure between corresponding seasons of
consecutive years. The similarity between partitions/covers
is computed with the normalized mutual information, as
usual. The similarity of two weighted networks like the
ones under study is measured in the following way. First,
one computes the distance $d_{t,t+\Delta t}$ between the matrices
$W^{t}$ and $W^{t+\Delta t}$:

$d_{t,t+\Delta t} = \sqrt{\sum_{ij} \left(W^{t}_{ij} - W^{t+\Delta t}_{ij}\right)^2}.$

The matrix $W^{t}$ is derived from the standard weight matrix
by dividing each edge weight by the sum of all edge weights.
This is done because the traffic flows tend to increase steadily
in time, so comparing the original weight matrices is not
appropriate. The quantity $d_{t,t+\Delta t}$ is a dissimilarity
measure. We turn it into a similarity index by changing its
sign, adding a constant and rescaling the resulting values.
Since we wish to compare the trend of the network
similarity with that of the partition/cover similarity, the
additional constant and the rescaling factor are chosen
so as to reproduce the average and the variance of the
curve of the normalized mutual information. After this
operation, the two trends are finally comparable. The
diagram shows that both measures follow a yearly
periodicity, with peaks corresponding to the winter season,
which is therefore more stable than the others.
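
The network similarity described above can be illustrated with a minimal Python sketch. It is not taken from the OSLOM software: the function names, the dictionary representation of the weight matrix (keys are index pairs (i, j), missing pairs count as zero) and the use of the population standard deviation for the rescaling are assumptions made for illustration only.

```python
import math
from statistics import mean, pstdev


def normalize(weights):
    """Divide each edge weight by the sum of all edge weights."""
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()}


def network_distance(weights_a, weights_b):
    """Euclidean distance between two normalized weight matrices.

    Each matrix is a dict {(i, j): weight}; absent entries are zero.
    """
    a, b = normalize(weights_a), normalize(weights_b)
    pairs = set(a) | set(b)
    return math.sqrt(sum((a.get(p, 0.0) - b.get(p, 0.0)) ** 2 for p in pairs))


def rescale_to_similarity(distances, nmi_values):
    """Flip the sign of the distances, then shift and rescale them so that
    their mean and standard deviation match those of the NMI curve."""
    flipped = [-d for d in distances]
    m_f, s_f = mean(flipped), pstdev(flipped)
    m_n, s_n = mean(nmi_values), pstdev(nmi_values)
    if s_f == 0.0:
        # Constant distances carry no trend: return a flat curve at the NMI mean.
        return [m_n for _ in flipped]
    return [m_n + (x - m_f) * s_n / s_f for x in flipped]
```

With this normalization, two snapshots whose traffic differs only by an overall growth factor have distance zero, which is the reason given in the text for not comparing the original weight matrices.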

Figure 16: Application of OSLOM to real networks: US airport network. The maps show the position of the airports, which are
represented by symbols, indicating the communities found by applying OSLOM directly to the corresponding network, without
exploiting the information of previous snapshots. The diagram shows the "seasonality" of air traffic. The normalized mutual
information (diamonds) was computed comparing the cover of the system at time $t$ adjusted by OSLOM on the network at
time $t + \Delta t$, and the cover obtained by applying OSLOM directly to the system at time $t + \Delta t$. The circles are estimates
of the similarity of the network matrices of snapshots separated by $\Delta t$ (one year). For each year we took four snapshots, by
cumulating the traffic of each trimester. The most stable networks are typically in winter (vertical lines).



IV. DISCUSSION 

We have introduced OSLOM, the first method that
finds clusters in networks based on their statistical
significance. It is a multi-purpose technique, capable of
handling various types of graphs, accounting for edge
direction, edge weights, overlapping communities,
hierarchy and network dynamics. Therefore, it can be used for
a wide variety of datasets and applications.

We have thoroughly tested OSLOM against the best 
algorithms currently available on various types of arti- 
ficial benchmark graphs, with excellent results. In par- 
ticular, OSLOM is superior on directed graphs and in 
the detection of strongly overlapping clusters. Moreover, 
it is an ideal method to recognize the absence of com- 
munity structure and/or the presence of randomness in 



graphs. In some cases OSLOM returns slightly less accurate
results than other methods, because it finds several
homeless vertices when communities are fuzzy. This is
due to the fact that, in the realizations of benchmark
graphs, it may happen that some vertices end up having
as many neighbors (or even more) in other
communities as in their own, due to fluctuations, even
if on average this does not happen. So, the classification
of those vertices, imposed by the planted $\ell$-partition
model, is not justified topologically. This is an important
general issue that needs to be addressed in the future, to
avoid systematic errors in the testing procedure.

OSLOM is a local algorithm, so it respects the na- 
ture of community structure, which is a local feature of 
networks, the more so the larger the systems at study. 
However, the null model adopted to estimate the statistical
significance of clusters is the configuration model,
which is global. This is the same null model adopted in
modularity optimization [63], and is responsible for the
serious problems of this technique, like its well known
resolution limit [64]. Therefore we perform an iterative
cluster search within the clusters found after the first ap- 
plication of the method, by considering each cluster as a 
network on its own. In this way we progressively limit 
the horizon of the part of the network under exploration, 
and we are able to find the smallest significant clusters, 
which are the natural building blocks of the network and 
the basis of its hierarchical community structure. So the 
null model, originally global, gets confined to smaller and 
smaller portions of the graph. The actual resolution of 
the method is thus not due to the null model, but to 
the choice of the threshold P. In this paper we have set 
P = 0.1, which is often used in various contexts and de- 
livers an excellent performance on the benchmark graphs 
we have adopted. Nevertheless, how much a real graph 
deviates from a random graph depends on the specific 
system at hand, and it would be more appropriate to 
estimate the threshold P case by case. This is an is- 
sue to consider for future work. We remark that also 
for modularity optimization one could in principle iter- 
atively restrict the null model to the clusters found by 
the method. However, modularity is based on the expected
value of variables estimated on the null model,
neglecting random fluctuations, which is why modularity
can attain large values on specific partitions of random
graphs [65-67]. OSLOM instead accounts for those
fluctuations, so it is far more reliable in this respect.
Furthermore, OSLOM is a local method, so it does not suffer
from the severe problems coming from modularity's
global optimization [68].

Another important aspect to emphasize is the need to
perform many iterations, to get more accurate results.
This is not a specific feature of OSLOM, but it should
be done for all community detection techniques with a
stochastic character, like methods based on optimization
(e.g., modularity optimization). In the literature there
is a general tendency to perform a single iteration, and
to reduce the complexity of an algorithm to the time
required to carry out one iteration. But this is not
appropriate, especially on large networks. For instance, by
performing a single iteration, vertices lying on the bor- 
der between clusters may be assigned to a specific cluster, 
while in many cases they are overlapping. By combining 
the results of several iterations, instead, it is more likely 
to distinguish overlapping vertices from the others. Fur- 
thermore, one can compute the strength of the member- 
ship of vertices in different clusters, from the frequency 
with which they were classified in each cluster. One can 
also disambiguate stable from unstable clusters, which 
could be recovered from specific iterations. So, it is cru- 
cial to collect and combine the results of many iterations. 
Of course, the complexity of the method grows with the 
number of iterations, but it can be considerably reduced 
by distributing runs among many different processors, if 



large computer clusters are available. 
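
One simple way to combine several iterations can be sketched in Python as follows. This is not the procedure implemented in OSLOM: it is a plain co-membership count, chosen here because it does not require matching cluster labels across runs. The function name and the data layout (each run is a list of clusters, each cluster a set of vertices) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_membership_frequency(runs):
    """Fraction of runs in which each pair of vertices shares a cluster.

    runs: list of covers; each cover is a list of clusters (sets of vertices).
    Clusters may overlap. Pairs that never co-occur are absent from the result.
    """
    counts = defaultdict(int)
    for cover in runs:
        pairs_in_run = set()
        for cluster in cover:
            for u, v in combinations(sorted(cluster), 2):
                pairs_in_run.add((u, v))
        # Count each pair at most once per run, even if it shares two clusters.
        for pair in pairs_in_run:
            counts[pair] += 1
    n_runs = len(runs)
    return {pair: c / n_runs for pair, c in counts.items()}
```

Pairs with frequency close to one belong to stable clusters, while a vertex that co-occurs with members of two different groups in a sizeable fraction of runs is a candidate overlapping vertex.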

The running time of OSLOM is dominated by the ex- 
haustive search of significant vertices, inside and outside 
the clusters. This search could be carried out with greedy 
approaches, with a huge computational advantage, and 
this is an improvement we plan to implement in the near 
future. On the other hand, if one wishes to attack very 
large graphs, OSLOM could be used at a second stage, as 
a refinement technique, to clean the results of an initial 
partition delivered by a fast algorithm. In this case, since 
the initial clusters are usually cores or parts of the signif- 
icant clusters we are looking for, OSLOM converges far 
more rapidly than its direct application without inputs. 
We have seen in the previous section that, by combining 
OSLOM with the Louvain method by Blondel et al., we 
were able to handle systems with millions of vertices. 

We have proposed a recipe to deal with the increasingly
important issue of detecting communities in
dynamic networks. The idea is to take advantage of the
information of different snapshots at the same time, by 
"adapting" the partition/cover of the earlier snapshot to 
the topology of the other one. In this way it is possible 
to uncover the correlation between the structures of the 
system at different time stamps. 

We have shown the versatility of OSLOM by applying
it to various networked datasets. OSLOM provides
the first comprehensive toolbox for the analysis of
community structure in graphs and is an ideal complement
to existing tools for network analysis. The algorithm,
with all its variants (including a fast two-step procedure
for the analysis of very large networks), is implemented
in freely downloadable and documented software
(http://www.oslom.org).

Appendix A: Numerical estimation of the internal
connection probability

The assessment of a cluster's significance given the
null (configuration) model relies on the estimation of the
probability described in Eq. 1. This function has to be
evaluated many times along the execution of OSLOM in
order to clean up each cluster and to evaluate the clusters
at the different hierarchical levels. We explain here
how the values of the distribution function can be estimated
or approximated in a practical implementation of
OSLOM.

For convenience, we rewrite the equation here 



$p(k_i^{in} \mid i, \mathcal{C}, \mathcal{G}) = A \times [\,\cdots\,]$   (A1)



While estimating the value of the probability of Eq. A1
for a certain $k_i^{in}$, the most computationally expensive
part is the evaluation of the normalization factor $A$. In
fact, this would force us to evaluate the rest of the formula
for all the allowed values of $k_i^{in}$ and add up the results.
A simple way out of this problem is to approximate
the distribution by another whose normalization factor is
known. To do so, we can think of a slightly different null
model, in which the edges are still drawn at random and
the formation of self-loops is admitted. This is actually
the null model on which the definition of modularity is
based [40]. In such a model, the equivalent of Eq. A1 becomes
a hypergeometric function that is much easier to
estimate (see [31]). Both distributions, that of Eq. A1
and the hypergeometric, provide close numerical values
for the same $k_i^{in}$, except if the probability of generating
self-loops in the null model is high. The probability that,
reshuffling the connections at random, a stub of vertex $i$
connects to another stub of the same vertex is given by
$k_i/2M$. In the software implementation of OSLOM, the
hypergeometric approximation for Eq. A1 is used as long
as $k_i^2/2M < 1$. Otherwise, we directly measure $A$ from
Eq. A1.
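
To make the hypergeometric approximation concrete, here is a minimal Python sketch. The parametrization is an assumption made for illustration, not the one used in the OSLOM code: vertex $i$ has `k_i` stubs, the cluster offers `m_c` stubs, the graph has `two_m` stubs in total, and the `k_i` partners are drawn without replacement from all stubs. Function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(k_in, k_i, m_c, two_m):
    """Probability that exactly k_in of the k_i stubs of a vertex pair
    with the m_c stubs of the cluster, out of two_m stubs in total."""
    if k_in < 0 or k_in > k_i or k_in > m_c:
        return 0.0
    # math.comb returns 0 when the lower index exceeds the upper one.
    return comb(m_c, k_in) * comb(two_m - m_c, k_i - k_in) / comb(two_m, k_i)


def hypergeom_tail(k_in, k_i, m_c, two_m):
    """Probability of k_in or more connections with the cluster."""
    upper = min(k_i, m_c)
    return sum(hypergeom_pmf(k, k_i, m_c, two_m) for k in range(k_in, upper + 1))
```

The normalization is known in closed form (the binomial coefficient in the denominator), so no sum over all allowed values of the internal degree is needed to evaluate a single probability.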



Appendix B: Extension of the method to weighted 
networks 



In the main text, it is briefly discussed how to extend
OSLOM to weighted graphs. We mention also that some
of the technical issues, such as combining both $r_w$ and
$r_t$, are not trivial. This procedure is described here in
further detail.

Remember that we start from an ansatz for the distribution
of the weights in the null model. The probability of having a
certain weight on the edge joining vertices $i$ and $j$ was
assumed to be

$p(w_{ij} > x \mid k_i, k_j, s_i, s_j) = \exp(-x/\langle w_{ij}\rangle).$   (B1)

The idea behind this expression is that the weight of an
edge is proportional to the average weight of its endvertices
($\langle w_i\rangle = s_i/k_i$ and $\langle w_j\rangle = s_j/k_j$). We proposed the
harmonic average because it is more sensitive to small
values of $\langle w_i\rangle$. Our goal is to define a fitness function $r$
which has to be a uniform random variable on our randomized
weighted network. And we want to combine
the fitness function depending on the topology with one
depending on the weight distribution in order to detect
meaningful fluctuations in any of them.

Let us consider a vertex $i$ which has $l$ connections with
a given subgraph $\mathcal{C}$ (not including $i$). For the topological
part, we have already computed the probability that $i$
shares $l$ or more edges with vertices of $\mathcal{C}$ (Eq. A1). We
call this number $r_t$. Each of the $l$ edges joining $i$ with
$\mathcal{C}$ carries a weight. We consider the corresponding
normalized weight $\omega_s = w_s/\langle w_s\rangle$, where $w_s$ is the weight on
the $s$-th edge, with $s = 1, 2, \ldots, l$. Since we want a single
number taking into account all the weights in the set, we
can simply consider the sum of all the $\omega_s$:

$\Omega = \sum_{s=1}^{l} \omega_s.$   (B2)

$\Omega$ is the sum of $l$ exponentially distributed variables (with
rate equal to one) and therefore it follows the Erlang
distribution [69]. Let us call $r_w$ the cumulative of $\Omega$:

$r_w = p(\Omega > x) = e^{-x} \sum_{q=0}^{l-1} x^q/q!.$   (B3)
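
The weight score can be evaluated numerically with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the expected edge weight is taken to be the standard harmonic mean of the two average vertex weights, $2ab/(a+b)$, which is our reading of "harmonic average" and is not spelled out in this appendix.

```python
import math


def expected_weight(s_i, k_i, s_j, k_j):
    """Harmonic mean of the average weights s_i/k_i and s_j/k_j (assumed form)."""
    w_i, w_j = s_i / k_i, s_j / k_j
    return 2.0 * w_i * w_j / (w_i + w_j)


def normalized_weight_sum(weights, expected):
    """Sum of the normalized weights w_s / <w_s> over the edges joining i and C."""
    return sum(w / e for w, e in zip(weights, expected))


def erlang_survival(x, l):
    """p(Omega > x) for a sum of l unit-rate exponentials:
    exp(-x) * sum_{q=0}^{l-1} x^q / q!"""
    term, total = 1.0, 0.0
    for q in range(l):
        total += term          # term equals x^q / q! at this point
        term *= x / (q + 1)    # advance to x^(q+1) / (q+1)!
    return math.exp(-x) * total
```

A small value of `erlang_survival(omega, l)` means that the $l$ edges joining the vertex to the cluster carry more weight than the null model would typically produce.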



In this way, we managed to define two variables $r_t$ and $r_w$
which are both uniformly distributed in the null model.
Now, we would like to combine these two scores to have
a final score for our vertex $i$. Unfortunately this is not
so simple. We recall that $r_w$ is defined only on the
$N_n$ neighbors of subgraph $\mathcal{C}$ while $r_t$ is defined for all the
$N^* = N - n_{\mathcal{C}} > N_n$ vertices out of $\mathcal{C}$, so the two variables
are defined on samples of different size, in general. A way
to overcome this difficulty is to scale $r_t$ to an equivalent
random variable $r'_t$ defined on a smaller sample. This
amounts to mapping each index $i$ in the set $1, 2, \ldots, N^*$ of
the old variable onto an index $j$ in the set $1, 2, \ldots, N_n$
of the new variable. Given $i$, the natural solution is to
pick the index $j$ such that the cumulative probability
on the sample of $N^*$ vertices coincides (at least with the
approximation allowed by the specific numerics involved)
with the cumulative probability on the smaller sample
of $N_n$ vertices. It can be shown that this can be achieved
with a good approximation (in the limit of $j$ close to $N_n$)
with the following rescaling:



$r'_t = r_t \times [\,\cdots\,]$   (B4)



Once we have computed $r'_t$ and $r_w$, we need to combine them
in order to have a single score to rank the vertices.
We consider the product $r'_t \cdot r_w$ and the final score
$r_{tw} = p(r'_t \cdot r_w < x) = x(1 - \log x)$. The last expression
comes from the assumption that the two variables
are both uniform and independent. The set of variables
$\{r_{tw}\}$ is then used to rank the vertices and to compute
the cumulative probabilities, with $N_n$ instead of $N^*$.
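
The closed form for the product of two independent uniform variables follows from $p(UV < x) = x + \int_x^1 (x/u)\,du = x - x\log x$ for $0 < x < 1$. As a minimal Python sketch (function names are hypothetical, and the rescaled topological score is simply passed in as an argument):

```python
import math


def product_cdf(x):
    """P(U1 * U2 < x) for two independent uniform variables on (0, 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x * (1.0 - math.log(x))


def combined_score(r_t_rescaled, r_w):
    """Single score obtained from the topological and weight scores,
    assuming both are uniform and independent in the null model."""
    return product_cdf(r_t_rescaled * r_w)
```

Because the map is monotonic, ranking vertices by the combined score is equivalent to ranking them by the product of the two scores; the transformation only serves to make the result uniform under the null model again.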



Appendix C: Further tests on benchmark graphs 

1. Girvan-Newman benchmark

The benchmark by Girvan and Newman [8] (GN 
benchmark) is a class of graphs with 128 vertices each,



Figure 17: Test on the Girvan-Newman benchmark graphs.
The variable $k_{out}$ is the average number of external neighbors
per vertex. The two curves refer to OSLOM (diamonds) and
Infomap (circles).



divided into four equal-sized groups. Every vertex has
expected degree 16 (with a very peaked distribution
about 16). The (average) number of neighbors of a
vertex within its group is $k_{in}$, whereas the (average)
number of external neighbors is $k_{out}$. By construction,
$k_{in} + k_{out} = 16$. In the language of the planted $\ell$-partition
model [43], the probability that a vertex is linked to another
vertex of its group is $p = k_{in}/31$, the probability
that a vertex is linked to external vertices is $q = k_{out}/96$.
The condition $p > q$ for the four groups to be communities
is then equivalent to $k_{out} \lesssim 12$ (this does not account
for random fluctuations, though [30, 31]).
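
The threshold on the number of external neighbors follows directly from the group sizes: each vertex has 31 possible partners in its own group of 32 and 96 outside it. A short Python check, with hypothetical function names and the benchmark sizes as default arguments:

```python
def gn_link_probabilities(k_out, degree=16, group_size=32, n_groups=4):
    """Internal and external link probabilities (p, q) of the GN benchmark."""
    k_in = degree - k_out
    p = k_in / (group_size - 1)                 # 31 possible internal partners
    q = k_out / (group_size * (n_groups - 1))   # 96 possible external partners
    return p, q


def gn_threshold(degree=16, group_size=32, n_groups=4):
    """Value of k_out at which p equals q; groups satisfy p > q below it."""
    internal = group_size - 1
    external = group_size * (n_groups - 1)
    return degree * external / (internal + external)
```

Setting $p = q$ gives $k_{out} = 16 \cdot 96 / 127 \approx 12.09$, so the condition $p > q$ holds for $k_{out}$ up to about 12.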

Fig. 17 shows the Normalized Mutual Information (in
the version devised in Ref. [23]) between the planted
partition of the GN benchmark and the partition found by
the algorithm as a function of $k_{out}$. As a term of
comparison we used again Infomap [49]. Fig. 17 shows that
Infomap is more accurate for low values of $k_{out}$ than
OSLOM, but its performance drops rapidly for $k_{out}$ above
about 6, whereas OSLOM shows a slower decay.

OSLOM is slightly worse than Infomap because it finds
several homeless vertices, as we explained in the main
text (Section III A 1).



2. Weighted LFR benchmark 



In Figs. 18 and 19 we report the comparative analysis
of OSLOM and Infomap on weighted LFR graphs.
To build the weighted benchmark graphs [42] one needs
two additional parameters: the exponent $\beta$ of the relation
between the strength of a vertex and its degree (the
strength of a vertex is the sum of the weights of the edges
incident on the vertex); the weighted mixing parameter

Figure 18: Test on weighted LFR benchmark graphs (undirected
and without overlapping communities). The parameters
are: $N = 5000$, $\langle k\rangle = 20$, $k_{max} = 50$, $\tau_1 = 2$, $\tau_2 = 1$,
$\beta = 1.5$. Each panel corresponds to a given value of the
topological mixing parameter $\mu_t$ and of the community range (S
or B).



$\mu_w$, which is the natural extension to weighted networks
of the topological $\mu$ (that here we call $\mu_t$), i.e. it is the
ratio between the sum of the weights on the edges joining
a vertex to its neighbors in different communities and the
strength of the vertex. In the analysis, we fix the value of
the topological mixing parameter $\mu_t$ and see how the
normalized mutual information varies as a function of $\mu_w$.
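
The two mixing parameters of a single vertex can be computed as in the following Python sketch. The adjacency representation (a dictionary mapping each vertex to its neighbors and edge weights) and the function name are assumptions made for illustration; the sketch covers non-overlapping communities only.

```python
def mixing_parameters(vertex, neighbors, community):
    """Topological and weighted mixing parameters (mu_t, mu_w) of a vertex.

    neighbors: {vertex: {neighbor: weight}}
    community: {vertex: community label}
    """
    edges = neighbors[vertex]
    external = [
        weight
        for neighbor, weight in edges.items()
        if community[neighbor] != community[vertex]
    ]
    mu_t = len(external) / len(edges)          # fraction of external neighbors
    mu_w = sum(external) / sum(edges.values())  # external weight over strength
    return mu_t, mu_w
```

A vertex can have half of its neighbors outside its community and still send most of its weight inside it, which is the regime in which the weights carry information beyond the topology.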



In Fig. 18 the benchmark graphs consist of 5000 vertices,
and we consider the usual two ranges of community sizes
(S and B). In Fig. 19 the graphs consist of 50000 vertices,
and we consider a single, but much wider, range
of community sizes (from 20 to 1000). When $\mu_t = 0.5$
or $\mu_t = 0.6$, we find that OSLOM detects the right clusters
for any value of $\mu_w$, for $N = 5000$, which is truly
remarkable, while Infomap is unable to find the partition
for $\mu_w > 0.6$. OSLOM's striking result comes from the

Figure 19: Test on weighted LFR benchmark graphs (undirected
and without overlapping communities). The parameters
are: $N = 50000$, $\langle k\rangle = 20$, $k_{max} = 200$, $\tau_1 = 2$, $\tau_2 = 1$,
$\beta = 1.5$. Each panel corresponds to a given value of the
topological mixing parameter $\mu_t$. The range of community sizes is
[20, 1000].



Figure 20: Test on directed LFR benchmark graphs (unweighted
and without overlapping communities). The parameters
are: $\langle k\rangle = 20$, $k_{max} = 50$, $\tau_{in} = 2$, $\tau_2 = 1$. Each panel
corresponds to a given network size ($N = 1000, 5000$) and
community range (S or B). The mixing parameter $\mu$ refers to
in-degree.



fact that the score $r_{tw}$ of a vertex on weighted graphs
is given by the product of two numbers, the topological
score $r'_t$ and the weight score $r_w$ (Section II D 5). If $\mu_t$
is not too large, the topological term $r'_t$ is very low and
brings down the whole score $r_{tw}$, which remains significant
for any choice of the weighted mixing parameter $\mu_w$.
Basically, OSLOM is able to recognize the right clusters
from the topology alone. When $\mu_t = 0.5$ or $\mu_t = 0.6$ and
$N = 50000$, OSLOM maintains an excellent performance
for the whole range of $\mu_w$, while Infomap again fails for
$\mu_w > 0.6$. For $\mu_t = 0.7$ the performances of the two
algorithms worsen and OSLOM is still superior, though the
results are essentially comparable for both network sizes.
For $\mu_t = 0.8$ Infomap is more accurate than OSLOM
when $N = 5000$, while both methods are not very good
when $N = 50000$. However, from Figs. 18 and 19 it is
apparent that OSLOM works the better, the larger the
network size. So, on very large networks ($N \gtrsim 50000$) we
expect OSLOM to have a performance comparable or superior
to that of Infomap for every pair of values ($\mu_t$, $\mu_w$).
We also infer that the performance of both algorithms
worsens if clusters are on average larger.

Figure 21: Test on directed LFR benchmark graphs (unweighted
and without overlapping communities). The parameters
are: $\langle k\rangle = 20$, $k_{max} = 200$, $\tau_{in} = 2$, $\tau_2 = 1$. We consider
two large network sizes: $N = 50000$ (left) and $N = 100000$
(right). The range of community sizes is [20, 1000]. The mixing
parameter $\mu$ refers to in-degree.



3. Directed LFR benchmark 



Figs. 20 and 21 show the results of the test on directed
LFR graphs. This time we have to distinguish
between in-degree (number of incoming edges)
and out-degree (number of outgoing edges) of a vertex.
The in-degree distribution is taken to be a power law,
with exponent $\tau_{in}$, whereas the out-degree is the same
for all vertices, for simplicity. The mixing parameter $\mu$
expresses the ratio of the number of in-neighbors of a
vertex belonging to different clusters and the total number
of in-neighbors of the vertex. The in-neighbor of a
vertex $i$ is any vertex $j$ connected to $i$ by an edge going
from $j$ to $i$. Figs. 20 and 21 tell us that OSLOM outperforms
Infomap, especially when communities span a

Figure 22: Application of OSLOM to real networks: Zachary's
karate club.



Figure 23: Application of OSLOM to real networks: Lusseau's 
social network of bottlenose dolphins. 



broader range of sizes. The performances of both algo- 
rithms slightly worsen on larger networks. 
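
For directed graphs, the mixing parameter defined above counts in-neighbors only. A minimal Python sketch, assuming the graph is given as a list of (source, target) pairs and communities do not overlap (both the representation and the function name are illustrative choices):

```python
def in_mixing_parameter(vertex, edges, community):
    """Fraction of in-neighbors of a vertex that lie in other clusters.

    edges: iterable of (source, target) pairs
    community: {vertex: cluster label}
    """
    in_neighbors = {source for source, target in edges if target == vertex}
    if not in_neighbors:
        raise ValueError("vertex has no incoming edges")
    external = sum(
        1 for source in in_neighbors if community[source] != community[vertex]
    )
    return external / len(in_neighbors)
```

Outgoing edges of the vertex do not enter the computation, consistently with the captions of Figs. 20 and 21, where the mixing parameter refers to in-degree.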



Appendix D: Real-world systems 

1. Zachary karate club 

The famous karate club network of Zachary [70] is a 
standard benchmark in community detection. Vertices 
are members of a karate club in the United States, who 
were monitored during a period of three years. Edges 
connect members who had social interactions outside the 
club. After some time, a conflict between the club president
and the instructor caused the fission of the club into
two separate groups, supporting the instructor and the
president, respectively. In Fig. 22 we see the community
structure found by OSLOM. It indeed finds two commu- 
nities, plus a homeless vertex (12). Vertex 3 is shared 
between the two clusters, as it has several neighbors in 
both groups. We shall illustrate overlapping and home- 
less vertices with stars and triangles, respectively. The 
communities coincide with the ones observed by Zachary 
with the exception of vertices 3 and 12, which Zachary 
put with the squares. However, vertex 3 is overlapping, 
so it belongs to both clusters, which seems quite reason- 
able by looking at the figure. Also, vertex 12 is homeless 
due to its loose relationship with its group (it has only 
one neighbor). 



2. Dolphin social network 



Fig. 23 presents OSLOM's results for the network
of bottlenose dolphins living in Doubtful Sound (New
Zealand). The network was compiled by Lusseau [71].
Vertices of the network are dolphins and two dolphins are
connected if they were seen together more often than
expected by chance. The dolphins separated into two groups
after one of them left the place for some time. OSLOM
finds two communities, with five overlapping vertices (2,
8, 20, 29, 31), plus two homeless vertices (40, 61), which
are very loosely connected to the rest of the graph. All
vertices which are uniquely assigned to the same group
(indicated by the same symbol, square or circle, in the
figure) are classified in the same community by Lusseau
as well.



3. American college football

Another well known benchmark in community detection
is the network of American college football teams,
compiled by Girvan and Newman [8]. It comprises 115
vertices, representing Division I-A colleges. Edges correspond
to games played by the teams against each other
during the regular season of fall 2000. The teams are
divided into 12 conferences. Games between teams in
the same conference are usually (but not always) more
frequent than games between teams of different conferences,
so there is an organization in clusters where communities
correspond to conferences. In Fig. 24 we see that



OSLOM finds three hierarchical levels. The lowest level 
consists of 11 clusters and 5 homeless vertices. There are 
no overlapping vertices. Six clusters correspond exactly 
to the conferences, three others match the conferences 
up to one vertex, one up to two vertices, the last cluster 
along with the homeless vertices mostly mix teams of the 
conferences Sun Belt and Independents. The latter is not 
a proper conference, whereas Sun Belt includes colleges 

Figure 24: Application of OSLOM to real networks: American college football network.




Figure 25: Application of OSLOM to real networks: metabolic network of C. elegans. 



which are geographically very spread out, so they happen
to play games quite often with the other teams, resulting
much more mixed with them than teams of other conferences.
Interestingly, in the second hierarchical level we
find two large communities (plus four homeless teams),
corresponding quite well to a geographical separation of
the colleges in East and West.



4. C. elegans metabolic network



Fig. 25 presents the community structure of the
metabolic network of C. elegans. The network has been
compiled by Duch and Arenas [72] and it has been often
used in applications of community detection algorithms.
Vertices are metabolites and edges connect pairs
of metabolites involved in at least one biochemical
reaction. OSLOM finds two hierarchical levels, the lower
with 25 clusters, the higher with 3 (but one of them is
much smaller than the other two). The fraction of homeless
vertices in the lower level is larger than 20% (see
Table 1) and the network appears rather "noisy".